We consider the problem of decentralized estimation using wireless sensor networks. Specifically,
we propose a novel framework based on level-triggered sampling, a non-uniform sampling strategy, and
sequential estimation. The proposed estimator can be used as an asymptotically optimal
fixed-sample-size decentralized estimator under non-fading listening channels (through which sensors
collect their observations), as an alternative to the one-shot estimators commonly found in the
literature. It can also be used as an asymptotically optimal sequential decentralized estimator under
fading listening channels. We show that the optimal centralized estimator under Gaussian noise is
characterized by two processes, namely the observed Fisher information $U_t$ and the observed
correlation $V_t$. Under non-fading listening channels only $V_t$ is random, whereas under fading
listening channels both $U_t$ and $V_t$ are random. In the proposed scheme, each sensor computes its
local random process(es), and sends a single bit to the fusion center (FC) whenever the local random
process(es) pass(es) certain predefined levels. The FC, upon receiving a bit from a sensor, updates
its approximation to the corresponding global random process, and accordingly its estimate. The
sequential estimation process terminates when the observed Fisher information (or the approximation
to it) reaches a target value. We provide an asymptotic analysis for the proposed estimator and also
for the one based on conventional uniform-in-time sampling under both non-fading and fading channels,
and determine the conditions under which they are asymptotically optimal, consistent, and
asymptotically unbiased. Analytical results, together with simulation results, demonstrate the
superiority of the proposed estimator based on level-triggered sampling over the traditional
decentralized estimator based on uniform sampling.

Index Terms: Decentralized estimation, level-triggered sampling, observed Fisher information,
asymptotic optimality, sequential analysis.

I. Introduction

Decentralized parameter estimation is a fundamental signal processing task that can be realized in
wireless sensor networks. Due to the stringent bandwidth and energy constraints of sensors, it is
typically performed under the constraints of low bandwidth usage and low communication rate. That is,
sensors need to communicate infrequently with the fusion center (FC) in an FC-based network (or,
similarly, with the neighboring sensors in an ad hoc network), consuming little bandwidth, e.g.,
sending only a few bits each time.

In an FC-based network, which we consider in this paper, there are two types of communication
channels, namely listening channels through which sensors obtain their local observations, and
reporting channels through which sensors transmit information bits (or analog signals) to the FC.
Various types of reporting channel have been analyzed in the literature. For instance, [?]-[?] assume
orthogonal (parallel) error-free channels; [?], [?] assume orthogonal non-fading continuous channels;
[?], [?] assume orthogonal discrete channels (BSC); [10], [11] assume a non-fading multiple access
channel (MAC); and finally [12]-[13] assume a fading MAC. In this paper, we assume orthogonal
error-free reporting channels in order to focus on the proposed framework for decentralized
estimation, which is described and analyzed in the following sections. For the listening channel,
however, there is no such diversity among the existing works. To the best of our knowledge, only the
non-fading channel (deterministic listening channel gains) has been considered. In this paper, we
analyze, for the first time, fading listening channels (random listening channel gains), which
correspond to random observed Fisher information, complicating the design and analysis. Moreover, our
analysis is based on the general case of complex listening channel gains and noise, whereas most of
the existing works assume real listening channel gains and noise, e.g., [1]-[16] except for [13].
Another common assumption in the literature is identically distributed noise, or noise with the same
statistics, e.g., [1]-[16] except for [?], [?] and [14]. In this paper, we also avoid making such an
assumption.

Some of the decentralized estimators proposed in previous works are universal in the sense that
they do not depend on the probability density function (pdf) of the listening channel noise, e.g.,
[4], [6], [14], [16]. The estimators in [4] and [16] are also independent of the network size and the
sensor index, i.e., robust to changes in the network size (sensor addition/failure), which is a
practically desirable feature for decentralized estimators [11]. Similarly, our estimators are robust
in that sense (although different thresholds are assumed at each sensor as a general case, the
derivations and analysis also cover the specific case of using the same threshold). Most of the
estimators in the literature, including the ones in the references above, except for [4] and [16] as
already noted, are not robust in that sense.

In decentralized estimation, sensors use either digital or analog transmission to send their
observations to the FC. To conform to the low bandwidth requirement, sensors either quantize their
observations with a small number of bits, such as 1 bit (e.g., [2], [?], [?]), or appropriately
pulse-shape their analog transmissions (e.g., [?]-[14]). Quantization with a small number of bits is
much easier to implement than analog transmission, but the observations are then recovered only at a
coarse resolution at the FC. Dithering is used in [?] to reduce the bias and improve the consistency
of a quantization-based estimator. In [18], it is shown that random dithering can significantly reduce
the Cramer-Rao lower bound (CRLB) compared to the no-dithering case. Moreover, in [19], deterministic
dithering is shown to be optimal in terms of minimizing the CRLB.

All of the references above, except for [18], perform fixed-sample-size (one-shot) estimation.
However, as stated in [18], in fixed-sample-size estimation it is not possible to further refine the
quality of the estimate before and after the estimation time, unlike in sequential estimation.
Moreover, it is natural to expect that sequential estimators require significantly fewer samples than
their fixed-sample-size counterparts to achieve the same quality of estimate, since sequential
detection methods are known to require, on average, approximately four times fewer samples than their
fixed-sample-size counterparts for the same level of confidence [20, Page 109]. Hence, in this paper
we are interested in sequential decentralized estimators rather than fixed-sample-size ones. In
addition, we will show in the following sections that sequential estimation is inevitable when
sensors collect their observations through fading channels, i.e., when the listening channel gains
are random. There are a few works considering sequential decentralized estimation in the literature,
e.g., [18], [21]-[23], in which sensors employ conventional uniform-in-time samplers to sample and
transmit their local observations. On the other hand, similar to [24], in this paper we consider
level-triggered sampling, a non-uniform sampling strategy that is well suited to transmitting
information in decentralized systems, as recently shown in [25]-[27]. Level-triggered sampling,
eliminating the need for quantization, naturally outputs 1-bit information, which upon transmission
produces a high quality recovery at the FC with a very fine resolution (even full resolution if
sensors observe continuous-time signals with continuous paths). Hence, level-triggered-sampling-based
information transmission, sending 1 bit per sample, enjoys the simplicity of digital transmission
while being as powerful as analog transmission in producing fine-resolution recovery. Furthermore, it
provides censoring of unreliable observations, similarly to [?].
The decentralized estimators in [?]-[?], [?] involve iterative procedures for solving convex
optimization problems. It is concluded in [1] that under relaxed bandwidth constraints the
simple-minded quantized sample mean estimator (QSME), in which sensors simply send their quantized
observations to the FC, should be preferred over more complex estimators. Our
level-triggered-sampling-based estimators are as simple as QSME, and are designed under strict
bandwidth constraints.

In this paper, we use the standard notation to denote the types of convergence of random variables,
e.g., $\xrightarrow{d}$, $\xrightarrow{p}$, $\xrightarrow{a.s.}$ and $\xrightarrow{L^n}$ denote
convergence in distribution, convergence in probability, almost sure convergence and convergence in
the $n$-th order moment, respectively. Throughout the paper, $E[\cdot]$ and $\mathrm{Var}(\cdot)$
denote expectation and variance, respectively. We also use the asymptotic notations $o(\cdot)$,
$O(\cdot)$, $\Theta(\cdot)$, and $\omega(\cdot)$ in their standard definitions. In particular, $o(1)$
represents a term that tends to 0, and $O(1)$ represents a constant term. Although asymptotic
notations signify only the order and ignore constant factors, e.g.,
$O(\log \bar{I}) = O(\log \sqrt{\bar{I}})$, we will express constant factors, i.e., differentiate
$O(\log \sqrt{\bar{I}})$ from $O(\log \bar{I})$, to compare the competing schemes more accurately.

The remainder of the paper is organized as follows. We formulate the decentralized estimation problem
and provide the necessary background information in Section II. The optimal centralized estimator and
the decentralized estimators that we propose are described in Section III and Section IV,
respectively. In Section V, the asymptotic performances of the proposed decentralized estimators are
analyzed. Finally, we give simulation results in Section VI and conclude the paper in Section VII.

II. Problem Formulation and Background 

Consider the problem of estimating a non-random parameter, $x \in \mathbb{R}$, at a central unit,
i.e., the fusion center (FC), via noisy observations collected at $K$ distributed nodes, i.e.,
sensors. Let $y_t^k$, $t \in \mathbb{N}$, $k = 1, \ldots, K$, denote the discrete-time noisy sample
observed by the $k$-th sensor at time $t$, given by

$$ y_t^k = x h_t^k + w_t^k, \qquad (1) $$

where $x$ is the constant parameter to be estimated, $h_t^k \in \mathbb{C}$ is the channel gain,
random in general, and observed by the $k$-th sensor, and $w_t^k \sim \mathcal{N}_c(0, \sigma_k^2)$ is
the complex Gaussian noise, assumed to be independent and identically distributed (i.i.d.) across time
and independent but not necessarily identically distributed across sensors. Accordingly, given
$h_t^k$ we have $y_t^k \sim \mathcal{N}_c(x h_t^k, \sigma_k^2)$, i.e., $y_t^k$ is conditionally
Gaussian. Note that random $h_t^k$, in the general case, corresponds to fading channels. We will also
consider additive white Gaussian noise (AWGN) channels, where $h_t^k = h_k$, $\forall k, t$, as a
particular case.

If sensors transmit their observations in whole by using an infinite number of bits, then the FC has
access to all local observations $\{y_t^k\}_{t,k}$ (see footnote 1), which corresponds to the
conventional centralized estimation problem. In practice, however, due to power and bandwidth
constraints, sensors typically sample their observations and transmit only a few bits per sample to
the FC. In such a decentralized setup, the FC can only obtain a summary of the local observations, on
which it bases its estimate. The performance of a decentralized estimator therefore depends on how
comprehensive the summary received by the FC is. In other words, the sampling and quantization
strategies at the sensors, and the fusion rule employed by the FC, determine the performance of a
decentralized estimator. Since under ideal conditions (i.e., no sampling and infinite-precision
quantization) the decentralized estimator becomes the centralized one, the optimal performance of the
centralized estimator is a benchmark for decentralized estimators. Hence, we will first analyze the
optimal centralized estimator.

Let $\mathcal{H}_t^k$ denote the set of channel gains observed at the $k$-th sensor up to time $t$,
i.e., $\mathcal{H}_t^k = \{h_\tau^k\}_\tau$ (see footnote 2). Define also
$\mathcal{H}_t = \{\mathcal{H}_t^k\}_k$. In this paper, we are interested in an estimator (centralized
or decentralized), $\hat{x}_t$, of $x$, that is conditionally unbiased, i.e.,
$E[\hat{x}_t | \mathcal{H}_t] = x$, $\forall t$, and in minimum time achieves a specified target
accuracy in terms of the squared error loss, i.e., $(\hat{x}_t - x)^2 \le 1/\bar{I}$. Since $x$ is
unknown, we need to estimate the true squared error to assess the accuracy of the estimator. In
general, the mean squared error (MSE), $E[(\hat{x}_t - x)^2] = \mathrm{Var}(\hat{x}_t)$, is used to
estimate the true squared error. In this paper, we will use the conditional variance,
$\mathrm{Var}(\hat{x}_t | \mathcal{H}_t)$, in the presence of an ancillary statistic $\mathcal{H}_t$
[?]. Note that $\mathrm{Var}(\hat{x}_t) = E[\mathrm{Var}(\hat{x}_t | \mathcal{H}_t)]$, and whenever
$\mathrm{Var}(\hat{x}_t | \mathcal{H}_t)$ itself is available, there is no need to use its mean. Hence
the conditional variance is a better (in fact the best [29]) estimate of the true squared error than
the unconditional variance. Thus, we aim to find the conditionally unbiased estimator,
$\hat{x}_{\mathcal{T}}$, that satisfies the following inequality,

$$ \mathrm{Var}(\hat{x}_{\mathcal{T}} | \mathcal{H}_{\mathcal{T}}) \le \frac{1}{\bar{I}}, \qquad (2) $$

where $\mathcal{T}$, given in (3), is the minimum time for any conditionally unbiased estimator to
achieve the target accuracy $1/\bar{I}$.

The Cramer-Rao lower bound (CRLB), defined using the Fisher information $I_t$, provides the minimum
variance for an unbiased estimator of $x$ at time $t$, i.e.,
$\mathrm{Var}(\hat{x}_t) \ge \mathrm{CRLB} = 1/I_t$ [20, pp. 171]. Given $\mathcal{H}_t$, we can define
the conditional Fisher information, $I_t^c$, and accordingly the conditional CRLB, $1/I_t^c$, as in
[?]. Then, similarly, we have $\mathrm{Var}(\hat{x}_t | \mathcal{H}_t) \ge 1/I_t^c$. Assuming a
conditionally efficient estimator, which achieves
$\mathrm{Var}(\hat{x}_{\mathcal{T}} | \mathcal{H}_{\mathcal{T}}) = 1/I_{\mathcal{T}}^c$, from (2) the
optimal estimation time (stopping time) $\mathcal{T}$ is given by

$$ \mathcal{T} = \min\{t \in \mathbb{N} : I_t^c \ge \bar{I}\}, \qquad (3) $$

as in [?].

1 The subscripts $t$ and $k$ in the set notation denote $t \in \mathbb{N}$ and $k = 1, \ldots, K$, respectively.
2 We use the subscript $\tau$ in the set notation to denote $\tau = 1, \ldots, t$.

Note that conditional unbiasedness and efficiency imply unconditional unbiasedness and efficiency,
respectively, but not vice versa. By imposing the condition in (2) we want to satisfy the target
accuracy at each realization, which is a stricter requirement than its unconditional counterpart
$\mathrm{Var}(\hat{x}_{\mathcal{T}}) \le 1/\bar{I}$, which aims to satisfy the target accuracy only on
average. We will next analyze the conditional maximum-likelihood estimator (MLE), which will be shown
to be conditionally unbiased and efficient, as the optimal centralized estimator.



III. Optimal Centralized Estimator 

In the centralized setup under fading channels, the $k$-th sensor transmits both $\{y_t^k\}_t$ and
$\{h_t^k\}_t$ to the FC by using an infinite number of bits, hence both $\{y_t^k\}_{t,k}$ and
$\{h_t^k\}_{t,k}$ are available to the FC. Note that under fading channels $\{y_t^k | h_t^k\}_t$
(across time) are independent, but not identically distributed (i.n.i.d.), and similarly
$\{y_t^k | h_t^k\}_k$ (across sensors). Under AWGN channels $\{y_t^k\}_k$ are i.n.i.d. (across
sensors), but $\{y_t^k\}_t$ are i.i.d. (across time).

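Hence, in general, due to the independence across sensors and time, the conditional log-likelihood
$L_t$ of the global observations up to time $t$, $\{y_\tau^k\}_{\tau,k}$, is given by

$$ L_t = \sum_{k=1}^{K} L_t^k = \sum_{k=1}^{K} \sum_{\tau=1}^{t} l_\tau^k, \quad \text{where} \quad l_\tau^k \triangleq -\frac{|y_\tau^k - x h_\tau^k|^2}{\sigma_k^2} - \log(\pi \sigma_k^2) \qquad (4) $$

is the conditional log-likelihood of a single observation $y_\tau^k$ given $h_\tau^k$. The conditional
score function, $S_t = \frac{\partial}{\partial x} L_t$, is then written as

$$ S_t = \sum_{k=1}^{K} \sum_{\tau=1}^{t} \frac{2\,\mathrm{Re}\big((y_\tau^k)^* h_\tau^k\big) - 2x|h_\tau^k|^2}{\sigma_k^2}, \qquad (5) $$

where $\mathrm{Re}(\cdot)$, $\mathrm{Im}(\cdot)$ and $(\cdot)^*$ denote the real part, the imaginary
part, and the complex conjugate of a complex number, respectively. Next, we write the conditional
observed Fisher information, $U_t = -\frac{\partial}{\partial x} S_t$, as

$$ U_t = \sum_{k=1}^{K} U_t^k = \sum_{k=1}^{K} \sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2}. \qquad (6) $$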
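The conditional MLE, $\hat{x}_t$, maximizes $L_t$, hence we have $S_t(\hat{x}_t) = 0$. From (5),
$\hat{x}_t$ is then given by

$$ \hat{x}_t = \frac{V_t}{U_t}, \qquad (7) $$

where

$$ V_t \triangleq \sum_{k=1}^{K} \sum_{\tau=1}^{t} \frac{2\,\mathrm{Re}\big((y_\tau^k)^* h_\tau^k\big)}{\sigma_k^2} = \sum_{k=1}^{K} V_t^k. \qquad (8) $$

We can rewrite (5) as $S_t = V_t - x U_t$. Dividing both sides by $U_t$ and using (7) we get

$$ \hat{x}_t = x + \frac{S_t}{U_t}. \qquad (9) $$

Writing (8) explicitly as
$V_t = \sum_{k=1}^{K} \sum_{\tau=1}^{t} \frac{2\,\mathrm{Re}(y_\tau^k)\mathrm{Re}(h_\tau^k) + 2\,\mathrm{Im}(y_\tau^k)\mathrm{Im}(h_\tau^k)}{\sigma_k^2}$,
and noting that $\mathrm{Re}(y_\tau^k) \sim \mathcal{N}(x\,\mathrm{Re}(h_\tau^k), \sigma_k^2/2)$ and
$\mathrm{Im}(y_\tau^k) \sim \mathcal{N}(x\,\mathrm{Im}(h_\tau^k), \sigma_k^2/2)$ given $h_\tau^k$, we
have $V_t \sim \mathcal{N}(x U_t, U_t)$, and thus $S_t \sim \mathcal{N}(0, U_t)$ given
$\mathcal{H}_t$. Therefore, from (9) we have

$$ \hat{x}_t \,|\, \mathcal{H}_t \sim \mathcal{N}(x, 1/U_t). \qquad (10) $$

From the definition of the Fisher information, $I_t = E[S_t^2] = E[U_t]$, we write the conditional
Fisher information as

$$ I_t^c = E[U_t | \mathcal{H}_t] = U_t = \sum_{k=1}^{K} \sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2}. \qquad (11) $$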

Hence, we have the following result for the conditional MLE. 



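Lemma 1. The conditional MLE, $\hat{x}_t$, given in (7), is conditionally unbiased, i.e.,
$E[\hat{x}_t | \mathcal{H}_t] = x$, consistent, i.e., $\hat{x}_t \xrightarrow{p} x$ given
$\mathcal{H}_t$, and efficient, i.e., $\mathrm{Var}(\hat{x}_t | \mathcal{H}_t) = 1/I_t^c$, $\forall t$.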

Proof: The proof is given in Appendix A. ■ 
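
As a minimal numerical sketch of the centralized MLE, the following Python snippet simulates the observation model $y_t^k = x h_t^k + w_t^k$ and accumulates the observed Fisher information $U_t$ and the observed correlation $V_t$, whose ratio gives the estimate. The function name `centralized_mle`, the choice of $\mathcal{N}_c(0, 1)$ fading gains, and the noise levels are illustrative assumptions and do not come from the paper.

```python
import math
import random

def centralized_mle(x, sigmas, t_max, seed=0):
    """Simulate y = x*h + w for K sensors; return (U_t, V_t, V_t / U_t)."""
    rng = random.Random(seed)
    U = 0.0  # observed Fisher information
    V = 0.0  # observed correlation
    for _ in range(t_max):
        for sigma in sigmas:
            # Assumed fading model for this demo only: h ~ CN(0, 1).
            h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
            # Complex noise w ~ CN(0, sigma^2): variance sigma^2 / 2 per real dimension.
            s = sigma / math.sqrt(2.0)
            w = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            y = x * h + w
            U += 2.0 * abs(h) ** 2 / sigma ** 2
            V += 2.0 * (y.conjugate() * h).real / sigma ** 2
    return U, V, V / U

U, V, x_hat = centralized_mle(x=1.5, sigmas=[1.0, 2.0], t_max=2000)
print(U, x_hat)
```

Given the channel gains, the estimate is Gaussian with mean $x$ and variance $1/U_t$, so the printed estimate should typically fall within a few multiples of $1/\sqrt{U_t}$ of the true value.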
Note that in the particular case of AWGN channels, where we have $h_\tau^k = h_k$, $\forall \tau$,
all results obtained conditional on $\mathcal{H}_t$ until now, including Lemma 1, are valid in their
unconditional forms since $h_k$, $\forall k$, is deterministic and known. Hence, in this case the
Fisher information, $I_t = \sum_{k=1}^{K} \frac{2|h_k|^2 t}{\sigma_k^2}$, is deterministic.
Consequently, the optimal stopping time, $\mathcal{T}$, defined in (3), is also deterministic and
given by

$$ \mathcal{T} = t_{\bar{I}} \triangleq \left\lceil \frac{\bar{I}}{\sum_{k=1}^{K} \frac{2|h_k|^2}{\sigma_k^2}} \right\rceil, \qquad (12) $$

where $\lceil \cdot \rceil$ is the ceiling operator. Hence, we have the following corollary.

Corollary 1. The fixed-sample-size MLE $\hat{x}_{t_{\bar{I}}}$, which has a variance of
$1/I_{t_{\bar{I}}}$ (cf. Lemma 1), is the optimal centralized estimator under AWGN channels in terms
of the objective in (2).
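
The deterministic stopping time under AWGN channels is a one-line computation: divide the target Fisher information by the information gained per time step and round up. A small sketch follows; the function name `awgn_stopping_time` and the example gains are hypothetical.

```python
import math

def awgn_stopping_time(target_info, gains, sigmas):
    """Smallest t such that t * sum_k 2|h_k|^2 / sigma_k^2 >= target_info."""
    info_per_step = sum(2.0 * abs(h) ** 2 / s ** 2 for h, s in zip(gains, sigmas))
    return math.ceil(target_info / info_per_step)

# Two sensors with unit-magnitude gains and unit noise variance gain 4 units of
# Fisher information per time step.
print(awgn_stopping_time(100.0, gains=[1 + 0j, 1j], sigmas=[1.0, 1.0]))
```

Because every quantity here is known in advance, the FC can compute this time before any observation is collected, which is why the AWGN case reduces to a fixed-sample-size problem.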

Under fading channels, however, the conditional Fisher information $I_t^c$ in (11), and accordingly
the optimal stopping time $\mathcal{T}$ in (3), are random. Hence in this case, we consider a
sequential conditional MLE, $(\mathcal{T}, \hat{x}_{\mathcal{T}})$. In [32, pp. 96], for non-i.i.d.
observations, the use of the CRLB was extended to sequential estimators. We can further extend it to
sequential conditional estimators as stated in the following lemma without proof.

Lemma 2. The conditional variance of a sequential estimator $(\mathcal{T}, \hat{x}_{\mathcal{T}})$
that is conditionally unbiased, i.e., $E[\hat{x}_{\mathcal{T}} | \mathcal{H}_{\mathcal{T}}] = x$, and
has a random stopping time $\mathcal{T}$, is lower bounded by the conditional CRLB, i.e.,

$$ \mathrm{Var}(\hat{x}_{\mathcal{T}} | \mathcal{H}_{\mathcal{T}}) \ge \frac{1}{I_{\mathcal{T}}^c}. \qquad (13) $$

Then, we can write the following corollary for the fading case.

Corollary 2. The sequential MLE $(\mathcal{T}, \hat{x}_{\mathcal{T}})$, having a conditional variance
of $1/I_{\mathcal{T}}^c$, is the optimal centralized estimator under fading channels in terms of the
objective in (2).

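Proof: It suffices to show that
$E[(\hat{x}_{\mathcal{T}} - x)^2 | \mathcal{H}_{\mathcal{T}}] = 1/I_{\mathcal{T}}^c$. Note that we
can write

$$ E\Big[\sum_{t=0}^{\infty} (\hat{x}_t - x)^2 \, 1_{\{t = \mathcal{T}\}} \,\Big|\, \mathcal{H}_{\mathcal{T}}\Big] = \sum_{t=0}^{\infty} E\big[(\hat{x}_t - x)^2 \,\big|\, \mathcal{H}_t\big] \, 1_{\{\mathcal{T} = t\}}, $$

where $1_{\{\cdot\}}$ is the indicator function, since $U_t$ depends only on $\mathcal{H}_t$ and,
having $I_t^c = U_t$ from (11), the event $\{\mathcal{T} = t\}$ is deterministic given
$\mathcal{H}_t$ [cf. (3)]. From Lemma 1 we have
$E[(\hat{x}_t - x)^2 | \mathcal{H}_t] = 1/I_t^c$, hence

$$ E\big[(\hat{x}_{\mathcal{T}} - x)^2 \,\big|\, \mathcal{H}_{\mathcal{T}}\big] = \sum_{t=0}^{\infty} \frac{1}{I_t^c} \, 1_{\{t = \mathcal{T}\}} = \frac{1}{I_{\mathcal{T}}^c}, $$

which concludes the proof. ■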
Note that we were able to obtain the optimal sequential estimator, which achieves the sequential
CRLB, because our stopping time $\mathcal{T}$ depends only on the channels, i.e., $U_t$, but not on
the observations $\{y_t^k\}$. In general, for a stopping time that also depends on the observations,
it was shown in [33] that the sequential CRLB is not attainable under the Gaussian distribution; it is
attainable only under the Bernoulli distribution. In the following section, following the optimal
centralized estimators in Corollary 1 and Corollary 2, we will propose decentralized estimators based
on either level-triggered sampling or the traditional uniform-in-time sampling. In Section V, we will
analyze the conditions under which the decentralized estimators given in Section IV achieve
asymptotic unbiasedness, consistency and asymptotic optimality.
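
The sequential centralized MLE for fading channels can be sketched in a few lines: keep accumulating the observed Fisher information and the observed correlation, and stop at the first time the former reaches the target. This is only an illustration; the function name `sequential_centralized_mle` and the $\mathcal{N}_c(0, 1)$ fading model are assumptions made for the demo.

```python
import math
import random

def sequential_centralized_mle(x, sigmas, target_info, seed=0):
    """Stop at the first t with U_t >= target_info; return (T, U_T, V_T / U_T)."""
    rng = random.Random(seed)
    U = 0.0
    V = 0.0
    t = 0
    while U < target_info:
        t += 1
        for sigma in sigmas:
            # Assumed fading model for this demo only: h ~ CN(0, 1).
            h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
            s = sigma / math.sqrt(2.0)
            w = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
            y = x * h + w
            U += 2.0 * abs(h) ** 2 / sigma ** 2
            V += 2.0 * (y.conjugate() * h).real / sigma ** 2
    return t, U, V / U

T, U_T, x_hat = sequential_centralized_mle(x=0.7, sigmas=[1.0, 1.5], target_info=400.0)
print(T, U_T, x_hat)
```

The stopping rule inspects only the channel gains, never the observations, which matches the reason given in the text for why the conditional CRLB is attained at the stopping time.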
IV. Decentralized Estimators

In this section, we will develop decentralized estimators,
$(\tilde{\mathcal{T}}, \tilde{x}_{\tilde{\mathcal{T}}})$, by imitating the optimal centralized
estimators given in the previous section. We will start with the case of AWGN channels, and then
continue with the general case of fading channels.

A. AWGN Channels 

Note that the optimal centralized estimator is computed using both $U_t$ and $V_t$ [cf. (7)], whereas
the optimal stopping time is determined using only $U_t$ [cf. (3) and (11)]. In this case, since we
have $h_t^k = h_k$, $\forall t$, from (6), $U_t = \sum_{k=1}^{K} \frac{2|h_k|^2 t}{\sigma_k^2}$ is
deterministic, and thus can be known by the FC beforehand. Hence, the optimal stopping time
$\mathcal{T}$ is deterministic and given by (12). In other words, under AWGN channels the
fixed-sample-size decentralized estimator $\tilde{x}_{t_{\bar{I}}}$ is of interest. In a
decentralized system, $V_t$ given in (8) is a random process unlike $U_t$, and thus is not readily
available to the FC. From Corollary 1 and (7), we see that $V_{t_{\bar{I}}}$ is a sufficient statistic
for optimally estimating $x$, hence sensors should report $\{V_{t_{\bar{I}}}^k\}_k$ to the FC. This
can be done either sequentially or once at the optimal stopping time, $t_{\bar{I}}$, using the same
number of bits in total on average.

The sequential approach, by its nature, has a number of practical advantages over the fixed-time
approach. Firstly, in the sequential approach, early estimates before the stopping time, i.e.,
$\{\tilde{x}_t : t < t_{\bar{I}}\}$, are available, although they are not as accurate as the final
estimate $\tilde{x}_{t_{\bar{I}}}$. This is a useful feature especially when $t_{\bar{I}}$ is large.
Secondly, in the sequential approach, each sensor sends several small messages to the FC until
$t_{\bar{I}}$, requiring significantly less bandwidth than sending a single large message at time
$t_{\bar{I}}$ in the fixed-time approach. Moreover, in the fixed-time approach there is a possibility
of congestion at the FC due to the burst of bits received at time $t_{\bar{I}}$.

In this paper, following the sequential approach, we propose a decentralized MLE based on
level-triggered sampling, which we call LT-DMLE. Note that LT-DMLE is still a fixed-sample-size
estimator despite the fact that it sequentially reports $\{V_t^k\}_k$ to the FC. We will first
describe the conventional decentralized MLE (DMLE), which follows the fixed-time approach, and then
LT-DMLE.

1) DMLE: We assume that the parameter to be estimated is bounded, i.e., $|x| < X$, and so is the
term $\frac{2\,\mathrm{Re}((y_t^k)^* h_t^k)}{\sigma_k^2}$ in (8), i.e.,
$\left| \frac{2\,\mathrm{Re}((y_t^k)^* h_t^k)}{\sigma_k^2} \right| < \phi < \infty$, $\forall k, t$.
Hence, from (8) we have $|V_t^k| < t\phi$. In DMLE, each sensor $k$ uniformly partitions the interval
$(-t_{\bar{I}}\phi, t_{\bar{I}}\phi)$ into $2^{R_k}$ subintervals; following the fixed-time approach,
at time $t_{\bar{I}}$, quantizes $V_{t_{\bar{I}}}^k$ into $\tilde{V}_{t_{\bar{I}}}^k$ using $R_k$
bits; and transmits the quantized bits to the FC. The quantization rule can be either deterministic
(e.g., a traditional mid-riser quantizer), or randomized (e.g., the quantizer in [26]).
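The FC, upon receiving $R_k$ bits from each sensor at time $t_{\bar{I}}$, recovers
$\tilde{V}_{t_{\bar{I}}}^k$, $k = 1, \ldots, K$, and then computes

$$ \tilde{V}_{t_{\bar{I}}} = \sum_{k=1}^{K} \tilde{V}_{t_{\bar{I}}}^k. \qquad (14) $$

Finally, similar to (7), the estimate

$$ \tilde{x}_{t_{\bar{I}}} = \frac{\tilde{V}_{t_{\bar{I}}}}{U_{t_{\bar{I}}}} \qquad (15) $$

is formed.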
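
The DMLE steps above can be sketched as follows: each sensor quantizes its local statistic on a uniform grid over $(-t\phi, t\phi)$ with $2^{R_k}$ cells, and the FC sums the reconstructed values and divides by the known Fisher information. Reconstructing each cell at its midpoint is an assumption of this sketch (the paper only says the rule may be deterministic or randomized), and the function names are hypothetical.

```python
import math

def quantize(value, lo, hi, bits):
    """Index of the uniform cell of [lo, hi) that contains value (clamped to the range)."""
    cells = 2 ** bits
    step = (hi - lo) / cells
    index = math.floor((value - lo) / step)
    return min(max(index, 0), cells - 1)

def reconstruct(index, lo, hi, bits):
    """Midpoint of the cell with the given index (an assumed reconstruction rule)."""
    step = (hi - lo) / 2 ** bits
    return lo + (index + 0.5) * step

def dmle_estimate(local_V, U, t, phi, bits):
    """FC side: sum the dequantized local statistics and divide by U."""
    total = 0.0
    for V_k, R_k in zip(local_V, bits):
        idx = quantize(V_k, -t * phi, t * phi, R_k)      # sensor k sends R_k bits
        total += reconstruct(idx, -t * phi, t * phi, R_k)  # FC recovers V_k approximately
    return total / U

print(dmle_estimate(local_V=[0.3, -0.6], U=2.0, t=1, phi=1.0, bits=[3, 3]))
```

With midpoint reconstruction the per-sensor error is at most half a cell width, $t\phi / 2^{R_k}$, which makes visible why the number of bits has to grow with the target accuracy.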

2) LT-DMLE: For LT-DMLE, following the sequential approach, we propose that each sensor $k$, via
level-triggered sampling, informs the FC whenever a considerable change occurs in its local process
$V_t^k$. Level-triggered sampling is a simple form of event-triggered sampling, in which the sampling
(communication) times $\{t_{n,V}^k\}_n$ (see footnote 4) are not deterministic, but rather dynamically
determined by the random process $V_t^k$, i.e.,

$$ t_{n,V}^k \triangleq \min\{t > t_{n-1,V}^k : V_t^k - V_{t_{n-1,V}^k}^k \notin (-d_k, d_k)\}, \quad n \in \mathbb{N}, \quad t_{0,V}^k = 0. \qquad (16) $$

The threshold parameter $d_k$ is a constant known by both sensor $k$ and the FC.

At each sampling time $t_{n,V}^k$, sensor $k$ transmits $r_V$ bits,
$b_{n,1}^k b_{n,2}^k \cdots b_{n,r_V}^k$, to the FC. The first bit, $b_{n,1}^k$, indicates the
threshold crossed (either $d_k$ or $-d_k$) by the incremental process
$v_n^k \triangleq V_{t_{n,V}^k}^k - V_{t_{n-1,V}^k}^k$, i.e.,

$$ b_{n,1}^k = \mathrm{sign}(v_n^k). \qquad (17) $$

The remaining $(r_V - 1)$ bits, $b_{n,2}^k \ldots b_{n,r_V}^k$, are used to quantize the
over(under)shoot $q_n^k \triangleq |v_n^k| - d_k$ into $\tilde{q}_n^k$. At each sampling time
$t_{n,V}^k$, the overshoot value $q_n^k$ cannot exceed the magnitude of the last sample in the
incremental process
$v_n^k = \sum_{\tau = t_{n-1,V}^k + 1}^{t_{n,V}^k} \frac{2\,\mathrm{Re}((y_\tau^k)^* h_\tau^k)}{\sigma_k^2}$,
i.e., $0 \le q_n^k < \phi$. Hence, the interval $[0, \phi)$ is uniformly divided into $2^{r_V - 1}$
subintervals. The quantization rule can again be either deterministic or randomized.

Note that if $V_t^k$ were a continuous-time process with continuous paths, then it would exactly hit
the thresholds, i.e., no overshoot would occur, and thus no quantization bits would be needed, i.e.,
$r_V = 1$. The threshold parameter $d_k$ is determined so that the $k$-th sensor, up to time
$t_{\bar{I}}$, transmits on average $R_k$ bits to the FC, i.e., communicates with the FC on average
$R_k / r_V$ times.

The FC, upon receiving the bits $b_{n,1}^k b_{n,2}^k \cdots b_{n,r_V}^k$ from sensor $k$ at time
$t_{n,V}^k$, recovers the quantized value of $v_n^k$ by computing

$$ \tilde{v}_n^k \triangleq b_{n,1}^k (d_k + \tilde{q}_n^k). \qquad (18) $$



3 DMLE corresponds to the quantized sample mean estimator (QSME) in [1].
4 The subscript $n$ in the set notation denotes $n \in \mathbb{N}$.
Then, it sequentially sums up $\{\tilde{v}_n^k\}_{n,k}$ at the sampling (communication) times
$\{t_{n,V}^k\}_{n,k}$ to obtain an approximation $\tilde{V}_t$ to the sufficient statistic $V_t$, i.e.,

$$ \tilde{V}_t \triangleq \sum_{k=1}^{K} \sum_{n=1}^{N_t^k} \tilde{v}_n^k = \sum_{k=1}^{K} \tilde{V}_t^k, \qquad (19) $$

where $N_t^k$ is the number of messages that the FC receives from sensor $k$ about $V_t^k$ up to
time $t$. During the times the FC receives no message, i.e., $t \notin \{t_{n,V}^k\}_{n,k}$,
$\tilde{V}_t$ is kept constant. Replacing $V_t$ with $\tilde{V}_t$ in (7), the following
decentralized estimator,

$$ \tilde{x}_t = \frac{\tilde{V}_t}{U_t}, \qquad (20) $$

is obtained at the FC. Finally, the scheme stops at time $\tilde{\mathcal{T}} = t_{\bar{I}}$
[cf. (12)] after computing the final estimate
$\tilde{x}_{t_{\bar{I}}} = \tilde{V}_{t_{\bar{I}}} / U_{t_{\bar{I}}}$.
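
To make the LT-DMLE message flow concrete, the sketch below implements one sensor's level-triggered sampler and the matching recovery at the FC. The sensor accumulates the increments of its local process, and whenever the accumulated change leaves $(-d, d)$ it sends a sign together with the index of the quantized overshoot. Midpoint reconstruction of the overshoot, the function names, and the requirement of at least two bits per message are assumptions of this sketch.

```python
def level_triggered_messages(increments, d, phi, r_bits):
    """Sensor side: list of (time, sign, overshoot_index); assumes r_bits >= 2."""
    cells = 2 ** (r_bits - 1)          # one bit is reserved for the sign
    step = phi / cells
    messages = []
    acc = 0.0                          # change in V since the last sampling time
    for t, inc in enumerate(increments, start=1):
        acc += inc
        if abs(acc) >= d:              # acc has left the interval (-d, d)
            sign = 1 if acc > 0 else -1
            overshoot = abs(acc) - d   # lies in [0, phi) when |inc| < phi
            idx = min(int(overshoot / step), cells - 1)
            messages.append((t, sign, idx))
            acc = 0.0                  # restart from the true value of V at time t
    return messages

def fc_recover(messages, d, phi, r_bits):
    """FC side: sum of sign * (d + reconstructed overshoot) over received messages."""
    step = phi / 2 ** (r_bits - 1)
    return sum(sign * (d + (idx + 0.5) * step) for _, sign, idx in messages)

msgs = level_triggered_messages([0.4, 0.4, 0.4, -0.5, -0.5, -0.5], d=1.0, phi=0.6, r_bits=3)
print(msgs, fc_recover(msgs, d=1.0, phi=0.6, r_bits=3))
```

In this toy run the sensor speaks only twice in six time steps, and the FC's running sum stays within one quantization cell per message of the true local process at the sampling times.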

B. Fading Channels 

Under fading channels, $U_t$ is random, hence sensors should report both $\{V_t^k\}_k$ and
$\{U_t^k\}_k$ to the FC. In this case, only the sequential approach can be used to report
$\{U_t^k\}_k$ to the FC since the stopping (optimal estimation) time, $\mathcal{T}$, is random. A
straightforward way to sequentially report $\{U_t^k\}_k$ is to use a conventional uniform-in-time
sampler followed by a quantizer. Alternatively, level-triggered sampling can be employed, which has
certain advantages over uniform-in-time sampling, as will be shown in Section V. On the other hand,
$\{V_t^k\}_k$, as in the AWGN case, can be reported to the FC either sequentially or once at the
stopping time, when the process stops. Hence, we propose two sequential decentralized MLEs based on
level-triggered sampling, and two based on uniform sampling. In the first group of estimators,
$\{U_t^k\}_k$ are sequentially reported, but $\{V_t^k\}_k$ are reported once at the stopping time,
hence the names LT-sDMLE (level-triggered sampling based sequential DMLE) and U-sDMLE (uniform
sampling based sequential DMLE) are used. In the second group, both $\{U_t^k\}_k$ and $\{V_t^k\}_k$
are sequentially reported, hence we name the estimators LT-dsDMLE (level-triggered sampling based
doubly sequential DMLE) and U-dsDMLE (uniform sampling based doubly sequential DMLE). We next explain
these four estimators in detail.

1) LT-sDMLE: In LT-sDMLE, sensors sample only $\{U_t^k\}_k$ via level-triggered sampling at the
following sampling times,

$$ t_{n,U}^k \triangleq \min\{t > t_{n-1,U}^k : U_t^k - U_{t_{n-1,U}^k}^k \ge e_k\}, \quad n \in \mathbb{N}, \quad t_{0,U}^k = 0, \qquad (21) $$

where the threshold $e_k$ is a constant chosen by the designer and made available to the FC and
sensor $k$. Note that in (21) we use a single threshold, unlike in (16), since $U_t^k$, given in (6),
is a nondecreasing process. At each sampling time $t_{n,U}^k$, sensor $k$ transmits $r_U$ bits to the
FC, all of which are used to quantize the overshoot $p_n^k \triangleq u_n^k - e_k$ into
$\tilde{p}_n^k$, where $u_n^k \triangleq U_{t_{n,U}^k}^k - U_{t_{n-1,U}^k}^k$. In this case, we do not
need to allocate a sign bit. Assume $\frac{2|h_t^k|^2}{\sigma_k^2} < \theta < \infty$,
$\forall k, t$, hence we have $0 \le p_n^k < \theta$, and the interval $[0, \theta)$ is uniformly
partitioned into $2^{r_U}$ subintervals. In other words, each sensor $k$, by using a deterministic or
randomized quantizer, determines the index of $\tilde{p}_n^k$, and then transmits it to the FC using
$r_U$ bits. Finally, when the scheme is terminated by the FC at the random stopping time
$\tilde{\mathcal{T}}$, each sensor $k$, as in DMLE, quantizes $V_{\tilde{\mathcal{T}}}^k$ into
$\tilde{V}_{\tilde{\mathcal{T}}}^k$ using $R_k$ bits, which are then transmitted to the FC.

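The FC, upon receiving the $r_U$ bits at time $t_{n,U}^k$, similar to (18) computes

$$ \tilde{u}_n^k = e_k + \tilde{p}_n^k. \qquad (22) $$

Then, similar to (19), it also computes

$$ \tilde{U}_t \triangleq \sum_{k=1}^{K} \sum_{n=1}^{M_t^k} \tilde{u}_n^k = \sum_{k=1}^{K} \tilde{U}_t^k, \qquad (23) $$

where $M_t^k$ is the number of messages that the FC receives from sensor $k$ about $U_t^k$ up to
time $t$. The scheme is terminated at the stopping time $\tilde{\mathcal{T}}$ [cf. (3), (11)], given by

$$ \tilde{\mathcal{T}} = \min\{t \in \mathbb{N} : \tilde{U}_t \ge \bar{I}\}. \qquad (24) $$

Finally, the FC, as in DMLE, upon receiving $R_k$ bits from each sensor at time
$\tilde{\mathcal{T}}$, recovers $\tilde{V}_{\tilde{\mathcal{T}}}^k$, $\forall k$, and computes
$\tilde{V}_{\tilde{\mathcal{T}}} = \sum_{k=1}^{K} \tilde{V}_{\tilde{\mathcal{T}}}^k$, as well as the
estimate
$\tilde{x}_{\tilde{\mathcal{T}}} = \tilde{V}_{\tilde{\mathcal{T}}} / \tilde{U}_{\tilde{\mathcal{T}}}$.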
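
The way the FC tracks the Fisher information in LT-sDMLE and decides when to stop can be sketched as below. Each sensor accumulates its local information increments $2|h_t^k|^2/\sigma_k^2$ and reports whenever the accumulated amount reaches its threshold; the FC adds the threshold plus the reconstructed overshoot to its running approximation and stops once the target is reached. The function name, the input layout, and the midpoint reconstruction are assumptions of this sketch.

```python
def lt_sdmle_stopping_time(info_increments, thresholds, theta, r_bits, target_info):
    """info_increments[t-1][k] is sensor k's increment at time t.

    Returns (stopping_time, approx_U), or (None, approx_U) if the target is not reached.
    """
    num_sensors = len(thresholds)
    cells = 2 ** r_bits                 # no sign bit is needed: U is nondecreasing
    step = theta / cells
    acc = [0.0] * num_sensors           # information gathered since each sensor's last report
    approx_U = 0.0                      # the FC's approximation of U_t
    for t, row in enumerate(info_increments, start=1):
        for k in range(num_sensors):
            acc[k] += row[k]
            if acc[k] >= thresholds[k]:
                overshoot = acc[k] - thresholds[k]            # in [0, theta)
                idx = min(int(overshoot / step), cells - 1)   # index sent with r_bits bits
                approx_U += thresholds[k] + (idx + 0.5) * step
                acc[k] = 0.0
        if approx_U >= target_info:
            return t, approx_U
    return None, approx_U

rows = [[0.5, 0.5]] * 6
print(lt_sdmle_stopping_time(rows, thresholds=[1.0, 1.0], theta=1.0, r_bits=2, target_info=4.0))
```

The stopping time here is computed from the approximation held by the FC rather than from the true information, so it can differ from the centralized stopping time by an amount governed by the thresholds and the quantization step.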

2) LT-dsDMLE: In LT-dsDMLE, there are two different sets of sampling times, namely
$\{t_{n,U}^k\}_{n,k}$ and $\{t_{n,V}^k\}_{n,k}$. Each sensor $k$, as in LT-sDMLE, at time $t_{n,U}^k$
[cf. (21)] quantizes $p_n^k$ into $\tilde{p}_n^k$, and transmits $r_U$ bits to the FC until the
stopping time $\tilde{\mathcal{T}}$. Similarly, each sensor $k$, as in LT-DMLE, at time $t_{n,V}^k$
[cf. (16)] quantizes $v_n^k$ into $\tilde{v}_n^k$, and transmits $r_V$ bits to the FC until the
stopping time $\tilde{\mathcal{T}}$.

The FC computes $\tilde{u}_n^k$ at time $t_{n,U}^k$ as in (22), and $\tilde{v}_n^k$ at time
$t_{n,V}^k$ as in (18). Then, it obtains $\tilde{U}_t$ and $\tilde{V}_t$ as in (23) and (19),
respectively. Next, similar to (20), the following estimator,

$$ \tilde{x}_t = \frac{\tilde{V}_t}{\tilde{U}_t}, \qquad (25) $$

is formed. Finally, the FC terminates the process at time $\tilde{\mathcal{T}}$, given by (24),
immediately after the final estimate
$\tilde{x}_{\tilde{\mathcal{T}}} = \tilde{V}_{\tilde{\mathcal{T}}} / \tilde{U}_{\tilde{\mathcal{T}}}$
is computed.

3) U-sDMLE: In U-sDMLE, each sensor $k$ uniformly samples $U_t^k$ with period $T_U$, i.e., at times
$\{mT_U\}_{m \in \mathbb{N}}$. Specifically, it computes the incremental process
$u_{mT_U}^k = U_{mT_U}^k - U_{(m-1)T_U}^k$ at time $mT_U$. Since $u_{mT_U}^k \in [0, T_U\theta)$, the
interval $[0, T_U\theta)$ is uniformly divided into $2^{r_U}$ subintervals. Then, at time $mT_U$, it
quantizes $u_{mT_U}^k$ into $\tilde{u}_{mT_U}^k$, and transmits the corresponding quantization level
index to the FC using $r_U$ bits. When the process stops at time $\tilde{\mathcal{T}}$, each sensor
$k$, as in DMLE and LT-sDMLE, quantizes $V_{\tilde{\mathcal{T}}}^k$ into
$\tilde{V}_{\tilde{\mathcal{T}}}^k$ using $R_k$ bits, and then transmits the quantization bits to the
FC.

The FC, at time $mT_U$, computes $\tilde{u}_{mT_U}^k$ using the received $r_U$ bits. Then, similar to
(23), it computes

$$ \tilde{U}_t \triangleq \sum_{k=1}^{K} \sum_{m=1}^{M_t} \tilde{u}_{mT_U}^k, \qquad (26) $$

where $M_t = \lfloor t/T_U \rfloor$ is the number of sampling (communication) times, until time $t$,
for $\{U_t^k\}_k$, and $\lfloor \cdot \rfloor$ is the floor operator. At time $\tilde{\mathcal{T}}$,
given in (24), the FC, as in DMLE and LT-sDMLE, terminates the process; recovers
$\tilde{V}_{\tilde{\mathcal{T}}}^k$ upon receiving $R_k$ bits; and finally computes
$\tilde{V}_{\tilde{\mathcal{T}}}$ and the estimate
$\tilde{x}_{\tilde{\mathcal{T}}} = \tilde{V}_{\tilde{\mathcal{T}}} / \tilde{U}_{\tilde{\mathcal{T}}}$.
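
For comparison with the level-triggered scheme, the uniform-in-time reporting of the Fisher information by a single sensor can be sketched as follows. Every `period` time steps the sensor quantizes the information gathered since its last report on a uniform grid over $[0, T_U\theta)$, and the FC adds the reconstructed value to its running sum. The function name and the midpoint reconstruction are assumptions of this sketch.

```python
def uniform_sampled_information(info_increments, period, theta, r_bits):
    """Return the FC's approximation of U_t for t = 1, ..., len(info_increments)."""
    cells = 2 ** r_bits
    step = period * theta / cells       # the increment over one period lies in [0, period * theta)
    approx_U = 0.0
    acc = 0.0
    history = []
    for t, inc in enumerate(info_increments, start=1):
        acc += inc
        if t % period == 0:             # uniform sampling time m * period
            idx = min(int(acc / step), cells - 1)
            approx_U += (idx + 0.5) * step
            acc = 0.0
        history.append(approx_U)        # unchanged between sampling times
    return history

print(uniform_sampled_information([0.5, 0.5, 0.5, 0.5], period=2, theta=1.0, r_bits=2))
```

Note that the quantization range grows with the sampling period here, whereas in the level-triggered scheme the overshoot range stays fixed regardless of how rarely the sensor transmits.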

4) U-dsDMLE: We also have two sets of sampling times in U-dsDMLE, for $\{U_t^k\}_k$ and
$\{V_t^k\}_k$, that are uniform in time with periods $T_U$ and $T_V$, respectively, i.e.,
$\{mT_U\}_m$ and $\{mT_V\}_m$. At time $mT_U$, as in U-sDMLE, each sensor $k$ computes
$u_{mT_U}^k$; quantizes it into $\tilde{u}_{mT_U}^k$ using $r_U$ bits; and transmits the quantization
bits to the FC. Similarly, at time $mT_V$, each sensor computes the incremental process
$v_{mT_V}^k = V_{mT_V}^k - V_{(m-1)T_V}^k$, for which we have
$v_{mT_V}^k \in (-T_V\phi, T_V\phi)$. The interval $(-T_V\phi, T_V\phi)$ is uniformly divided into
$2^{r_V}$ subintervals, and at time $mT_V$ the index of the quantization level corresponding to
$\tilde{v}_{mT_V}^k$ is transmitted to the FC using $r_V$ bits.

The FC, as in U-sDMLE, computes $\tilde{u}_{mT_U}^k$ at time $mT_U$, and also $\tilde{U}_t$ given by
(26). Similarly, at time $mT_V$, it computes $\tilde{v}_{mT_V}^k$ using the received $r_V$ bits.
Next, similar to (19), it computes

$$ \tilde{V}_t \triangleq \sum_{k=1}^{K} \sum_{m=1}^{N_t} \tilde{v}_{mT_V}^k, \qquad (27) $$

where $N_t = \lfloor t/T_V \rfloor$ is the number of sampling times, until time $t$, for
$\{V_t^k\}_k$. Using the approximations in (26) and (27), the estimator $\tilde{x}_t$ is computed as
in (25), at time $t$. The stopping time of the scheme is given by (24).

V. Performance Analysis 

In this section, we will derive the conditions under which the decentralized estimators outlined in
the previous section are, given $\mathcal{H}_{\tilde{\mathcal{T}}}$, asymptotically unbiased, i.e.,
$E[\tilde{x}_{\tilde{\mathcal{T}}} | \mathcal{H}_{\tilde{\mathcal{T}}}] \to x$, consistent, i.e.,
$\tilde{x}_{\tilde{\mathcal{T}}} \xrightarrow{p} x$, and asymptotically optimal. An estimator
$\hat{x}_t$ is said to be asymptotically optimal if $\sqrt{I_t}(\hat{x}_t - x)$ converges in
distribution to a standard Gaussian random variable, i.e.,
$\sqrt{I_t}(\hat{x}_t - x) \xrightarrow{d} \mathcal{N}(0, 1)$, as $t \to \infty$ [20, pp. 185]. In our
case, we let the target Fisher information $\bar{I}$ tend to infinity, thus for asymptotic optimality
we need to show that

$$ \sqrt{I_{\tilde{\mathcal{T}}}^c}\,(\tilde{x}_{\tilde{\mathcal{T}}} - x) \xrightarrow{d} \mathcal{N}(0, 1), \qquad (28) $$

given $\mathcal{H}_{\tilde{\mathcal{T}}}$. Note that asymptotic optimality, which is related to the
probability distribution, does not imply asymptotic efficiency, i.e.,
$E[(\tilde{x}_{\tilde{\mathcal{T}}} - x)^2 | \mathcal{H}_{\tilde{\mathcal{T}}}] \to 1/I_{\tilde{\mathcal{T}}}^c$,
which is related to the second moment.

A. AWGN Channels 

In this subsection, we drop the subscript $V$ in $t^k_{n,V}$ since sensors sample only $\{V_t^k\}_k$ in the AWGN case. The following theorem gives the conditions under which DMLE, following the fixed-time approach, is asymptotically unbiased, consistent, and asymptotically optimal.

Theorem 1. The decentralized estimator DMLE, given in Section IV-A, is, as $\bar{I} \to \infty$, asymptotically unbiased, i.e., $E[\tilde{x}_{t_{\bar{I}}} - x] \to 0$, and consistent, i.e., $\tilde{x}_{t_{\bar{I}}} \xrightarrow{p} x$, if $R_k \to \infty$ at any rate, $\forall k$. It is also asymptotically optimal, i.e., $\sqrt{U_{t_{\bar{I}}}}\,(\tilde{x}_{t_{\bar{I}}} - x) \xrightarrow{d} \mathcal{N}(0, 1)$, if $R_k \to \infty$ at a faster rate than $\log\sqrt{\bar{I}}$, i.e., $R_k = \omega(\log\sqrt{\bar{I}})$, $\forall k$.

Proof: The proof can be found in Appendix B. ■

Now, we proceed to analyze LT-DMLE, which follows the sequential approach to report $\{V_t^k\}_k$, but is still a fixed-sample-size estimator. The next two theorems give the conditions for LT-DMLE to be asymptotically unbiased, consistent, and asymptotically optimal.

Theorem 2. Consider the decentralized estimator LT-DMLE, given in Section IV-A. It is, as $\bar{I} \to \infty$, asymptotically unbiased, i.e., $E[\tilde{x}_{t_{\bar{I}}} - x] \to 0$, and consistent, i.e., $\tilde{x}_{t_{\bar{I}}} \xrightarrow{p} x$, if $d_k \to \infty$ at a slower rate than $\bar{I}$, i.e., $d_k = o(\bar{I})$, $\forall k$.

Proof: The proof is presented in Appendix C. ■

Theorem 3. The decentralized estimator LT-DMLE, given in Section IV-A, is, as $\bar{I} \to \infty$, asymptotically optimal, i.e., $\sqrt{U_{t_{\bar{I}}}}\,(\tilde{x}_{t_{\bar{I}}} - x) \xrightarrow{d} \mathcal{N}(0, 1)$, if $d_k = o(\sqrt{\bar{I}})$ and $r_V = \omega(\log(\sqrt{\bar{I}}/d_k))$, $\forall k$.

Proof: The proof is provided in Appendix D. ■
The analytical results obtained for AWGN channels in the preceding theorems are summarized in Table I. Note that there are two sources of discrepancy in the sequential estimator based on level-triggered sampling, LT-DMLE. One source is the discrepancy in the messages, i.e., the overshoot quantization error, represented by the first terms inside the parentheses in (33) and (35). The other source is the missing statistics at the FC, between the last sampling times of the sensors and the stopping time, represented by the second terms inside the parentheses in (33) and (35). Note that if $\{V_t^k\}_k$ were continuous-time processes with continuous paths, then only the second source of discrepancy would exist. Having the sampling threshold $d_k \to \infty$ as $\bar{I} \to \infty$ de-emphasizes the first source, since the number of messages decreases, and so does the accumulation of the overshoot quantization error. However, having $d_k \to \infty$ emphasizes the second source, since the sampling intervals increase, and so do the missing statistics within the incomplete sampling intervals. Therefore, while having $d_k \to \infty$, $\forall k$, as fast as possible is practically desired since it corresponds to asymptotically low communication rates, there is a trade-off in determining its rate, as can be seen in (33) and (35). Its rate is upper bounded by $\bar{I}$ and $\sqrt{\bar{I}}$ for asymptotic unbiasedness/consistency and asymptotic optimality, respectively (cf. second row of Table I). On the other hand, we want the number of bits, $r_V$, to be as small as possible since it corresponds to low bandwidth usage. To ensure asymptotic unbiasedness/consistency we can keep $r_V$ constant, whereas to ensure asymptotic optimality there is a lower bound, $\log(\sqrt{\bar{I}}/d_k)$, on its rate (cf. second row of Table I). However, note that by having the rate of $d_k$ arbitrarily close to $\sqrt{\bar{I}}$, which is the most practically desired choice for asymptotic optimality, we can make the rate of $r_V$ arbitrarily slow.

| | Asymptotic unbiasedness & Consistency | Asymptotic optimality |
|---|---|---|
| DMLE | $d_k \to \infty$ so that $d_k = o(\bar{I})$, $\forall k$, and $r_V = O(1)$ (Thm. 1) | $d_k \to \infty$ so that $d_k = o(\bar{I}/\log\sqrt{\bar{I}})$, and $r_V = O(1)$ (Thm. 1) |
| LT-DMLE | $d_k \to \infty$ so that $d_k = o(\bar{I})$, $\forall k$, and $r_V = O(1)$ (Thm. 2) | $d_k \to \infty$ so that $d_k = o(\sqrt{\bar{I}})$, and $r_V \to \infty$ so that $r_V = \omega(\log(\sqrt{\bar{I}}/d_k))$, $\forall k$ (Thm. 3) |

$d_k$: sampling threshold in LT-DMLE; $r_V$: number of bits sent with each message in LT-DMLE.

TABLE I
Conditions for LT-DMLE and paraphrased conditions for DMLE
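The interplay between the threshold rate and the lower bound on the number of bits can be checked numerically. The snippet below is a small sketch, assuming the hypothetical threshold choice $d_k = \bar{I}^{1/2 - \epsilon}$ with $\epsilon > 0$; for this choice the bounding term $\log(\sqrt{\bar{I}}/d_k)$ equals $\epsilon \log \bar{I}$, which grows more slowly as $\epsilon$ shrinks. The function name is illustrative, and the constants hidden by the order notation are ignored.

```python
import math


def rv_lower_bound_term(target_info, eps):
    """Evaluate log(sqrt(I) / d_k) for the hypothetical choice
    d_k = I ** (0.5 - eps), where I is the target Fisher information."""
    d_k = target_info ** (0.5 - eps)
    return math.log(math.sqrt(target_info) / d_k)
```

For a target Fisher information of $10^6$, taking $\epsilon = 0.05$ gives a bounding term of roughly 0.69, whereas $\epsilon = 0.25$ gives roughly 3.45.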

To be able to compare DMLE with LT-DMLE, let us analyze the conditions on $R_k$ in Theorem 1, shown in the first row of Table I. By definition, $R_k$ is the average number of bits transmitted until the stopping time by sensor $k$, i.e., $R_k = E[N^k_{t_{\bar{I}}}]\, r_V$. Note that $\{N^k_t\}_t$ is a renewal process since the received messages, $\{v^k_n\}_n$, are i.i.d., hence using Wald's identity we can write $E[N^k_{t_{\bar{I}}}] = t_{\bar{I}}/E[\tau^k]$, where $E[\tau^k]$ is the average sampling interval of sensor $k$, and it is known that $t_{\bar{I}} = \Theta(\bar{I})$ and $E[\tau^k] = \Theta(d_k)$ from (12) and the proof of Theorem 3, respectively. Therefore, paraphrasing the first part of Theorem 1, we can say that DMLE is asymptotically unbiased and consistent if $r_V = O(1)$, and $d_k \to \infty$ so that $d_k = o(\bar{I})$, exactly the same set of conditions required for LT-DMLE (cf. Table I). Similarly, we can rephrase the condition in the second part of Theorem 1 as $r_V = O(1)$, and $d_k \to \infty$ so that $d_k = o(\bar{I}/\log\sqrt{\bar{I}})$. Note that for LT-DMLE, the condition on $r_V$ to achieve asymptotic optimality is similar to the one here for DMLE, as $r_V$ can be made arbitrarily close to $O(1)$ for LT-DMLE. However, for DMLE, the condition on $d_k$ to achieve asymptotic optimality is more favorable than the one for LT-DMLE [$o(\bar{I}/\log\sqrt{\bar{I}})$ vs. $o(\sqrt{\bar{I}})$]. In other words, in DMLE the communication overhead can be asymptotically lower than that in LT-DMLE. On the other hand, LT-DMLE has a number of advantages in practice over DMLE, as discussed in Section IV-A.
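The paraphrased conditions follow from a short chain of order relations. As a sketch, with constants suppressed, $r_V$ held fixed, $\bar{I}$ denoting the target Fisher information, $t_{\bar{I}}$ the deterministic stopping time, and $E[\tau^k]$ the average sampling interval of sensor $k$ (the symbol $\tau^k$ is notation assumed here for illustration):

```latex
\begin{align*}
R_k &= E\big[N^k_{t_{\bar{I}}}\big]\, r_V
     = \frac{t_{\bar{I}}}{E[\tau^k]}\, r_V
     = \Theta\!\left(\frac{\bar{I}}{d_k}\right), \\
R_k \to \infty
    &\iff d_k = o(\bar{I}), \\
R_k = \omega\big(\log\sqrt{\bar{I}}\big)
    &\iff d_k = o\!\left(\frac{\bar{I}}{\log\sqrt{\bar{I}}}\right).
\end{align*}
```

The first line uses Wald's identity together with the rates $t_{\bar{I}} = \Theta(\bar{I})$ and $E[\tau^k] = \Theta(d_k)$ quoted in the text; the last two lines simply restate the conditions of Theorem 1 in terms of $d_k$.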

B. Fading Channels 

Under fading channels, $\{y_t^k\}_{t,k}$ are independent, but not identically distributed across sensors and time. Hence, the counting processes, such as $\{N^k_t\}_t$, are not renewal processes, and thus we cannot use Wald's identity in our derivations in this case. Another challenge in this general case is that the stopping time, $\mathcal{T}$, is random. In this section, we will first analyze the estimators LT-sDMLE and LT-dsDMLE based on level-triggered sampling, in the first four theorems, and then the estimators U-sDMLE and U-dsDMLE based on uniform sampling, in the last two theorems. Before proceeding to the theorems, we present a number of technical lemmas. From now on, $E[\cdot]$ will denote the conditional expectation given $\mathcal{H}^k_t$, e.g., $E[U^k_t] = E[U^k_t \mid \mathcal{H}^k_t]$, or $\mathcal{H}_t$, e.g., $E[U_t] = E[U_t \mid \mathcal{H}_t]$.

Lemma 3. The expected stopping time, $E[\mathcal{T}]$, tends to infinity at the same rate as $\bar{I}$, i.e., $E[\mathcal{T}] = \Theta(\bar{I})$, if $e_k = o(\bar{I})$, $\forall k$, for LT-sDMLE and LT-dsDMLE, and $T_U = o(\bar{I})$ for U-sDMLE and U-dsDMLE.

Proof: The proof is given in Appendix E. ■

Let us now analyze, in the following lemma, the asymptotic growth rate of the average discrepancy between the global process $U_t$ and its approximation $\tilde{U}_t$.

Lemma 4. We have $E[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|] = o(\bar{I})$ if $e_k \to \infty$ so that $e_k = o(\bar{I})$, $\forall k$, for LT-sDMLE and LT-dsDMLE with $r_U = O(1)$, and $r_U \to \infty$ at any rate for U-sDMLE and U-dsDMLE.

Proof: The proof can be found in Appendix F. ■

In the last lemma, we analyze the asymptotic growth rate of the expected conditional score function in absolute value.

Lemma 5. We have $E[|S_{\mathcal{T}}|] = o(\bar{I})$ if $e_k = o(\bar{I}^2)$, $\forall k$, for LT-sDMLE and LT-dsDMLE, and $T_U = o(\bar{I}^2)$ for U-sDMLE and U-dsDMLE.

Proof: The proof is presented in Appendix G. ■

Now, we proceed to analyze the singly sequential estimator, LT-sDMLE.

Theorem 4. Consider the sequential decentralized estimator LT-sDMLE, given in Section IV-B1. It is, as $\bar{I} \to \infty$, asymptotically unbiased, i.e., $E[\tilde{x}_{\mathcal{T}} - x] \to 0$, and consistent, i.e., $\tilde{x}_{\mathcal{T}} \xrightarrow{p} x$, if $R_k \to \infty$ at any rate, and $e_k \to \infty$ at a slower rate than $\bar{I}$, i.e., $e_k = o(\bar{I})$, $\forall k$.

Proof: The proof is provided in Appendix H. ■

Theorem 5. The sequential decentralized estimator LT-sDMLE, given in Section IV-B1, is, as $\bar{I} \to \infty$, asymptotically optimal, i.e., $\sqrt{U_{\mathcal{T}}}\,(\tilde{x}_{\mathcal{T}} - x) \xrightarrow{d} \mathcal{N}(0, 1)$, if $R_k \to \infty$ at a faster rate than $\log\sqrt{\bar{I}}$, i.e., $R_k = \omega(\log\sqrt{\bar{I}})$, $e_k = o(\sqrt{\bar{I}})$, and $r_U = \omega(\log(\sqrt{\bar{I}}/e_k))$, $\forall k$.

Proof: The proof is given in Appendix I. ■

Next, we analyze LT-dsDMLE, in which, in addition to $\{U^k_t\}$, $\{V^k_t\}$ is also sequentially transmitted, as opposed to LT-sDMLE.

Theorem 6. Consider the sequential decentralized estimator LT-dsDMLE, given in Section IV-B2. It is, as $\bar{I} \to \infty$, asymptotically unbiased, i.e., $E[\tilde{x}_{\mathcal{T}} - x] \to 0$, and consistent, i.e., $\tilde{x}_{\mathcal{T}} \xrightarrow{p} x$, if $d_k \to \infty$ and $e_k \to \infty$ at slower rates than $\bar{I}$, i.e., $d_k = o(\bar{I})$ and $e_k = o(\bar{I})$, $\forall k$.

Proof: The proof can be found in Appendix J. ■

Theorem 7. The sequential decentralized estimator LT-dsDMLE, given in Section IV-B2, is, as $\bar{I} \to \infty$, asymptotically optimal, i.e., $\sqrt{U_{\mathcal{T}}}\,(\tilde{x}_{\mathcal{T}} - x) \xrightarrow{d} \mathcal{N}(0, 1)$, if $d_k = o(\sqrt{\bar{I}})$, $r_V = \omega(\log(\sqrt{\bar{I}}/d_k))$, $e_k = o(\sqrt{\bar{I}})$, and $r_U = \omega(\log(\sqrt{\bar{I}}/e_k))$, $\forall k$.

Proof: The proof is presented in Appendix K. ■

Finally, in the following two theorems, we analyze U-sDMLE and U-dsDMLE, which are based on conventional uniform sampling.

Theorem 8. The sequential decentralized estimator U-sDMLE, given in Section IV-B3, is, as $\bar{I} \to \infty$, asymptotically unbiased, i.e., $E[\tilde{x}_{\mathcal{T}} - x] \to 0$, and consistent, i.e., $\tilde{x}_{\mathcal{T}} \xrightarrow{p} x$, if $r_U \to \infty$ at any rate, $T_U = o(\bar{I})$, and $R_k \to \infty$ at any rate. Moreover, it is asymptotically optimal, i.e., $\sqrt{U_{\mathcal{T}}}\,(\tilde{x}_{\mathcal{T}} - x) \xrightarrow{d} \mathcal{N}(0, 1)$, if $r_U = \omega(\log\sqrt{\bar{I}})$ and $R_k = \omega(\log\sqrt{\bar{I}})$.

Proof: The proof is provided in Appendix L. ■

Theorem 9. The sequential decentralized estimator U-dsDMLE, given in Section IV-B4, is, as $\bar{I} \to \infty$, asymptotically unbiased, i.e., $E[\tilde{x}_{\mathcal{T}} - x] \to 0$, and consistent, i.e., $\tilde{x}_{\mathcal{T}} \xrightarrow{p} x$, if $r_U \to \infty$ and $r_V \to \infty$ at any rate, and $T_U = o(\bar{I})$, $T_V = o(\bar{I})$. Moreover, it is asymptotically optimal, i.e., $\sqrt{U_{\mathcal{T}}}\,(\tilde{x}_{\mathcal{T}} - x) \xrightarrow{d} \mathcal{N}(0, 1)$, if $r_U = \omega(\log\sqrt{\bar{I}})$, $r_V = \omega(\log\sqrt{\bar{I}})$, $T_U = o(\bar{I})$, and $T_V = o(\sqrt{\bar{I}})$.

Proof: The proof is given in Appendix M. ■

| | Asymptotic unbiasedness & Consistency | Asymptotic optimality |
|---|---|---|
| LT-sDMLE | $d_k \to \infty$ s.t. $d_k = o(\bar{I})$, $e_k \to \infty$ s.t. $e_k = o(\bar{I})$, $\forall k$, $r_V = O(1)$, and $r_U = O(1)$ (Thm. 4) | $d_k \to \infty$ s.t. $d_k = o(\bar{I}/\log\sqrt{\bar{I}})$, $e_k = o(\sqrt{\bar{I}})$, $\forall k$, $r_V = O(1)$, and $r_U = O(1)$ (Thm. 5) |
| LT-dsDMLE | $d_k \to \infty$ s.t. $d_k = o(\bar{I})$, $e_k \to \infty$ s.t. $e_k = o(\bar{I})$, $\forall k$, $r_V = O(1)$, and $r_U = O(1)$ (Thm. 6) | $d_k = o(\sqrt{\bar{I}})$, $e_k = o(\sqrt{\bar{I}})$, $r_V = \omega(\log(\sqrt{\bar{I}}/d_k))$, and $r_U = \omega(\log(\sqrt{\bar{I}}/e_k))$, $\forall k$ (Thm. 7) |
| U-sDMLE | $d_k \to \infty$ s.t. $d_k = o(\bar{I})$, $e_k = o(\bar{I})$, $\forall k$, $r_V = O(1)$, and $r_U \to \infty$ at any rate (Thm. 8) | $d_k \to \infty$ s.t. $d_k = o(\bar{I}/\log\sqrt{\bar{I}})$, $e_k = o(\bar{I})$, $r_V = O(1)$, and $r_U = \omega(\log\sqrt{\bar{I}})$ (Thm. 8) |
| U-dsDMLE | $d_k = o(\bar{I})$, $e_k = o(\bar{I})$, $\forall k$, $r_V \to \infty$ at any rate, and $r_U \to \infty$ at any rate (Thm. 9) | $d_k = o(\sqrt{\bar{I}})$, $e_k = o(\bar{I})$, $\forall k$, $r_V = \omega(\log\sqrt{\bar{I}})$, and $r_U = \omega(\log\sqrt{\bar{I}})$ (Thm. 9) |

$d_k, e_k$: sampling thresholds in LT-dsDMLE; $r_V, r_U$: numbers of bits sent with each message in LT-dsDMLE.

TABLE II
Conditions for LT-dsDMLE, and paraphrased conditions for LT-sDMLE, U-sDMLE, U-dsDMLE



Table II summarizes the conditions required for the sequential decentralized estimators in fading channels to satisfy asymptotic unbiasedness, consistency, and asymptotic optimality. In order to make fair comparisons between the level-triggered-sampling-based estimators and the uniform-sampling-based estimators, we make the average rates of the messages received by the FC equal. Specifically, in the uniform-sampling-based estimators the average message rates are $K/T_U$ and $K/T_V$ since the FC receives $K$ messages every $T_U$ and $T_V$ units of time, respectively. For the level-triggered-sampling-based estimators, recalling that $N_t$ and $M_t$ are the numbers of messages until time $t$, we are interested in computing the limits $\lim_{t \to \infty} N_t/t$ and $\lim_{t \to \infty} M_t/t$ as the average message rates for $V_t$ and $U_t$, respectively. From [26, Eq. (40)], we can write $\lim_{t \to \infty} N_t/t = \sum_{k=1}^{K} 1/E[\tau^k_V]$ and $\lim_{t \to \infty} M_t/t = \sum_{k=1}^{K} 1/E[\tau^k_U]$, where $E[\tau^k_V]$ and $E[\tau^k_U]$ are the average sampling intervals of sensor $k$. Hence, we select the thresholds $\{d_k\}$ and $\{e_k\}$ so that $\sum_{k=1}^{K} 1/E[\tau^k_V] = K/T_V$ and $\sum_{k=1}^{K} 1/E[\tau^k_U] = K/T_U$, respectively. Assuming that $\{d_k\}$ and $\{e_k\}$ are selected so that $E[\tau^1_V] = E[\tau^k_V]$, $\forall k$, and $E[\tau^1_U] = E[\tau^k_U]$, $\forall k$, respectively, we need to set $E[\tau^k_V] = T_V$ and $E[\tau^k_U] = T_U$. As shown in the proof of Theorem 6, we have $E[\tau^k_V] = \Theta(d_k)$, and thus $T_V = \Theta(d_k)$. Similarly, we can show that $E[\tau^k_U] = T_U = \Theta(e_k)$.

Now, comparing LT-sDMLE with U-sDMLE, and LT-dsDMLE with U-dsDMLE, we see that the level-triggered-sampling-based schemes achieve asymptotic unbiasedness and consistency by keeping $r_U$ and $r_V$ constant, and having the average sampling intervals, i.e., $T_U$ and $T_V$, tend to infinity; whereas in the uniform-sampling-based schemes $r_U$ and $r_V$ must tend to infinity, regardless of $T_U$ and $T_V$, to achieve asymptotic unbiasedness and consistency (cf. first column of Table II). Similarly, for asymptotic optimality it is seen that in LT-sDMLE and LT-dsDMLE the growth rates of $r_U$ and $r_V$ can be made arbitrarily slow by having the rates of $T_U$ and $T_V$ arbitrarily close to $\sqrt{\bar{I}}$, whereas in U-sDMLE and U-dsDMLE the rates of $r_U$ and $r_V$ must be faster than $\log\sqrt{\bar{I}}$, regardless of $T_U$ and $T_V$ (cf. second column of Table II). In other words, increasing the average sampling intervals does not help in U-sDMLE and U-dsDMLE to improve the asymptotic performance without increasing the numbers of bits transmitted by each sensor at each communication time, but does help in LT-sDMLE and LT-dsDMLE.

The underlying reason for this fundamental difference in the asymptotic performances is that the overshoots in LT-sDMLE and LT-dsDMLE are bounded by constants that do not depend on the average sampling intervals, and so are the quantization errors. Hence, the bounded quantization errors become negligible compared to the received messages $v^k_n$ and $u^k_n$, given by (18) and (22), respectively, as the average sampling intervals tend to infinity, i.e., as the thresholds $d_k$ and $e_k$ tend to infinity. On the other hand, in U-sDMLE and U-dsDMLE, the quantization subintervals, and thus the quantization errors, get larger as the communication periods, $T_U$ and $T_V$, tend to infinity while $r_U$ and $r_V$ stay constant. As a result, the level-triggered-sampling-based schemes have a significant advantage over the uniform-sampling-based schemes, since it is practically desirable to have $r_U$ and $r_V$ small for the sake of low bandwidth usage, and $T_U$ and $T_V$ large for the sake of low communication rates.
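The contrast with uniform sampling is easiest to see in a toy version of level-triggered sampling with one bit per message. The sketch below is an assumption-laden illustration rather than the paper's exact scheme: it assumes that a sensor transmits at the first time the change in its local process since the previous message leaves $(-d, d)$, that the single bit encodes the sign of the crossing, and that the FC approximates each message by $\pm d$, ignoring the overshoot. The function names are hypothetical.

```python
def level_triggered_messages(increments, d):
    """Sensor side: scan per-step increments of a local process and emit
    (time, bit) whenever the change since the last message leaves (-d, d).
    bit = +1 for an upper crossing, -1 for a lower crossing."""
    messages = []
    acc = 0.0                      # change accumulated since the last message
    for t, inc in enumerate(increments, start=1):
        acc += inc
        if abs(acc) >= d:
            messages.append((t, 1 if acc > 0 else -1))
            acc = 0.0              # restart from the current value
    return messages


def fc_approximation(messages, d):
    """FC side with one bit per message: each message is taken as +/- d,
    so the overshoot beyond d is the only per-message error."""
    return sum(bit * d for _, bit in messages)
```

In this sketch the per-message error is the overshoot, whose size is set by a single increment and not by the threshold, so raising the threshold lowers the message rate without enlarging the error per message. In a uniform quantizer with a fixed number of bits, by contrast, the subinterval width scales with the sampling period.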

Next, comparing LT-dsDMLE to LT-sDMLE, we see that the comments made for the sampling threshold $d_k$ in Section V-A hold here for $d_k$ in LT-dsDMLE, and $e_k$ in both LT-sDMLE and LT-dsDMLE. Specifically, we want to have $e_k \to \infty$ as fast as possible to attain asymptotically low communication rates. However, the rate of $e_k$ is upper bounded by $\bar{I}$ and $\sqrt{\bar{I}}$ to achieve asymptotic unbiasedness/consistency and asymptotic optimality, respectively (cf. first two rows of Table II). Similarly, we want to have $r_U$ as small as possible to lower the bandwidth consumption, and accordingly $r_U$ can be kept constant to achieve asymptotic unbiasedness/consistency. For asymptotic optimality, even though the rate of $r_U$ is lower bounded by $\log(\sqrt{\bar{I}}/e_k)$, in practice it can be very slow, i.e., close to zero, if $e_k$ tends to infinity as fast as possible, i.e., close to $\sqrt{\bar{I}}$, as desired in practice.

For LT-sDMLE, to rephrase the conditions on $R_k$ in the first row of Table II, recall that $R_k = E[N^k_{\mathcal{T}}]\, r_V$. Here in the fading case, the counting process $N^k_{\mathcal{T}}$ is not a renewal process, as opposed to the AWGN case, since the messages received at the FC are not i.i.d. Nevertheless, we still have $E[N^k_{\mathcal{T}}] = \Theta(\bar{I}/d_k)$, as shown in the proof of Theorem 6. Hence, we can recast the condition on $R_k$ to achieve asymptotic unbiasedness/consistency as $r_V = O(1)$, and $d_k \to \infty$ so that $d_k = o(\bar{I})$, $\forall k$, which is identical to the condition for LT-dsDMLE, as can be seen in Table II. Similarly, the condition on $R_k$ to achieve asymptotic optimality can be rephrased as $r_V = O(1)$, and $d_k \to \infty$ so that $d_k = o(\bar{I}/\log\sqrt{\bar{I}})$. Note that for asymptotic optimality, the lower bounds on $r_V$ in LT-sDMLE and LT-dsDMLE are similar to each other, i.e., $O(1)$, but the upper bound on $d_k$ in LT-sDMLE, i.e., $\bar{I}/\log\sqrt{\bar{I}}$, is more favorable than the one in LT-dsDMLE, i.e., $\sqrt{\bar{I}}$. On the other hand, LT-dsDMLE possesses practical advantages over LT-sDMLE, especially when $\bar{I}$ is large, as explained in Section IV-A.

VI. Simulation Results 

The asymptotic performances of the proposed decentralized estimators were analyzed in Section V. In this section, we provide simulation results to compare their non-asymptotic performances. Throughout the section, we use $r_V = r_U = 1$ to illustrate the case of most practical interest, i.e., to conform to the low bandwidth usage requirement in decentralized systems. The thresholds $\{d_k\}$ and $\{e_k\}$ are determined to satisfy the given average sampling intervals $T_V$ and $T_U$, respectively. The upper bounds $\phi$ and $\theta$ are set as the 99-th percentiles of $2\Re\big((y^k_t)^* h^k_t\big)/\sigma_k^2$ and $2|h^k_t|^2/\sigma_k^2$, respectively.

For the AWGN case, our performance metric is the mean squared error (MSE), i.e., $E[(\tilde{x}_{t_{\bar{I}}} - x)^2]$. We plot it against four common parameters of both the centralized and the decentralized estimators, namely the stopping time $t_{\bar{I}}$, known to be deterministic; the number of sensors $K$; the signal-to-noise ratio (SNR) of the channel, $\mathrm{SNR}_k = |h_k|^2/\sigma_k^2$; and the bounding constant of the parameter to be estimated, i.e., $X$ where $|x| < X$. Note that $X$ represents the uncertainty level in $x$, and affects the value of the bounding constant $\phi$, which defines the quantization intervals for $V^k_{t_{\bar{I}}}$ in DMLE and $q^k_n$ in LT-DMLE.

On the other hand, for the fading case we use the expected stopping time, $E[\mathcal{T}]$, as the performance metric, and we plot $E[\mathcal{T}]$ against MSE, $K$, $\mathrm{SNR}_k = E[|h^k_t|^2]/\sigma_k^2$, and $X$.

A. AWGN Channels 

Fixed $K$, $\mathrm{SNR}_k$, and $X$, varying $t_{\bar{I}}$: Firstly, we set $K = 5$, $\mathrm{SNR}_k = 1$ (0 dB), $\forall k$, $X = 5$, and vary $\bar{I} = 25 \times 2^m$ where $m = 0, \ldots, 5$. Then, from (12) we have $t_{\bar{I}} = \lceil \bar{I}/(2K\,\mathrm{SNR}) \rceil = 3, 5, 10, 20, 40, 80$. We also increase the average sampling interval $T_V$ as the stopping time increases to meet the low communication rate requirement, i.e., $T_V = E[\tau^k_V] = 2 \times 1.4^m$, $\forall k$. Recalling that $T_V = \Theta(d_k)$ (cf. the proof of Theorem 6), we see that the rate of $T_V$ complies with Theorem 3 and also Theorem 1 (cf. the discussion at the end of Section V-A). In other words, the rate of $T_V$ (resp. $d_k$, $\forall k$), which is 1.4, is smaller than but close to the rate of $\sqrt{\bar{I}}$, which is $\sqrt{2}$. We keep the number of communication bits constant ($r_V = 1$) in accordance with Theorem 1 and Theorem 3. Hence, we maximize the performances of DMLE and LT-DMLE while conforming to the low communication rate and low bandwidth usage requirements.
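The parameter schedule of this experiment can be reproduced directly. The helper below is a sketch: its name is hypothetical, and the ceiling in the stopping-time formula is inferred from the listed values (for example, 25/(2 x 5 x 1) = 2.5 is reported as 3), since equation (12) itself is not reproduced in this section.

```python
import math


def awgn_setup(m, K=5, snr=1.0):
    """Return (target Fisher information, stopping time, average sampling
    interval) for index m, using the values stated in the text."""
    target_info = 25 * 2 ** m                         # target Fisher information
    t_stop = math.ceil(target_info / (2 * K * snr))   # deterministic stopping time
    T_V = 2 * 1.4 ** m                                # average sampling interval
    return target_info, t_stop, T_V


stopping_times = [awgn_setup(m)[1] for m in range(6)]
```

The list `stopping_times` matches the six stopping times quoted above, and the per-step growth factor of the sampling interval, 1.4, stays just below $\sqrt{2} \approx 1.414$, the per-step growth factor of the square root of the target Fisher information.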

In Fig. 1, it is seen that with short stopping times (up to $t_{\bar{I}} = 20$) LT-DMLE, following the sequential approach based on level-triggered sampling, performs significantly better than DMLE, which follows the fixed-time approach. However, when the stopping time becomes longer, DMLE outperforms LT-DMLE, and even reaches the optimal centralized estimation performance at $t_{\bar{I}} = 80$. This is due to the fact that the number of bits DMLE uses to quantize $V^k_{t_{\bar{I}}}$, i.e., $R_k = E[N^k_{t_{\bar{I}}}]\, r_V$, increases as the stopping time $t_{\bar{I}}$ increases, since the average number of messages transmitted in LT-DMLE until $t_{\bar{I}}$, i.e., $E[N^k_{t_{\bar{I}}}]$, increases with increasing $t_{\bar{I}}$. Accordingly, after $t_{\bar{I}} = 80$, $R_k$ becomes large enough that $V^k_{t_{\bar{I}}}$ is fully recovered at the FC, i.e., $\hat{V}^k_{t_{\bar{I}}} = V^k_{t_{\bar{I}}}$. In other words, the decentralized DMLE becomes the optimal centralized estimator. As pointed out in Section IV-A, DMLE does not meet the low bandwidth usage requirement, whereas LT-DMLE conforms to it by sending only 1 bit at each sampling instant. Furthermore, DMLE provides no early estimates, whereas LT-DMLE does.

Fig. 1. Mean squared error (MSE), i.e., $E[(\tilde{x}_{t_{\bar{I}}} - x)^2]$, vs. the stopping (estimation) time, i.e., $t_{\bar{I}}$, for the optimal centralized estimator, DMLE, and LT-DMLE with $r_V = 1$.

Fixed $t_{\bar{I}}$, $\mathrm{SNR}_k$, and $X$, varying $K$: Secondly, we set $t_{\bar{I}} = 15$, $T_V = 5$, $\mathrm{SNR}_k = 0$ dB, $\forall k$, $X = 5$, and vary $K = 2, \ldots, 10$. We plot the MSE vs. $K$ with $r_V = 1$ and $r_V = 2$ in Fig. 2-a and Fig. 2-b, respectively. With $r_V = 1$, the case of most practical interest, it is seen that the optimal centralized estimator has an MSE decaying with rate $1/K$, but DMLE and LT-DMLE, the latter being superior, have MSEs decaying with rates slower than $1/K$. The quantization error (resp. overshoot) problem caused by the small number of bits prevents DMLE (resp. LT-DMLE) from fully benefiting from the increasing number of sensors. However, when $r_V = 2$, the MSEs of both schemes seem to decay with rate $1/K$, as shown in Fig. 2-b. In this case, DMLE, consuming high bandwidth at time $t_{\bar{I}}$, attains the performance of the optimal centralized estimator.

Fig. 2. Mean squared error (MSE), i.e., $E[(\tilde{x}_{t_{\bar{I}}} - x)^2]$, vs. the number of sensors, i.e., $K$, for the optimal centralized estimator, DMLE, and LT-DMLE with (a) $r_V = 1$, (b) $r_V = 2$.



Fig. 3. Mean squared error (MSE), i.e., $E[(\tilde{x}_{t_{\bar{I}}} - x)^2]$, vs. SNR, i.e., $|h_k|^2/\sigma_k^2$, for the optimal centralized estimator, DMLE, and LT-DMLE with $r_V = 1$.

Fixed $t_{\bar{I}}$, $K$, and $X$, varying $\mathrm{SNR}_k$: Thirdly, we set $t_{\bar{I}} = 15$, $T_V = 5$, $K = 5$, $X = 5$, and vary $\mathrm{SNR}_k = -20, -10, \ldots, 30$ dB, $\forall k$. In Fig. 3, it is seen that the MSEs of DMLE and LT-DMLE decay with decreasing rates as SNR increases, and that of DMLE even stops decreasing after SNR = 10 dB. This is because the quantities to be transmitted to the FC, i.e., $V^k_{t_{\bar{I}}}$ in DMLE and $v^k_n$ in LT-DMLE, take larger values as SNR increases. As a result, the quantization errors in DMLE with constant $R_k$ grow considerably, causing the improvement in the MSE performance to diminish. LT-DMLE is less affected by this phenomenon since via 1 bit a significant part of $v^k_n$, i.e., $d_k$ [cf. (18)], is transmitted in any case, although the overshoot, $q^k_n$, grows with increasing SNR. Consequently, at high SNR LT-DMLE significantly outperforms DMLE.

Fig. 4. Mean squared error (MSE), i.e., $E[(\tilde{x}_{t_{\bar{I}}} - x)^2]$, vs. the bounding constant of $x$, i.e., $X$, for the optimal centralized estimator, DMLE, and LT-DMLE with $r_V = 1$.

Fixed $t_{\bar{I}}$, $K$, and $\mathrm{SNR}_k$, varying $X$: Lastly, we set $t_{\bar{I}} = 15$, $T_V = 5$, $K = 5$, $\mathrm{SNR}_k = 0$ dB, $\forall k$, and vary $X = 5\sqrt{10^m}$ where $m = -2, \ldots, 2$. It is seen in Fig. 4 that the performance of the optimal centralized estimator is not affected by the increase in the uncertainty in $x$ since no quantization takes place, i.e., all local observations are available to the FC. On the other hand, those of DMLE and LT-DMLE, using a constant number of bits, are deeply affected since quantization errors and overshoots, respectively, grow with increasing $X$. In LT-DMLE, with small values of $X$, e.g., $X = 0.5$, the overshoot, $q^k_n$, is negligible compared to the magnitude of the transmitted value, $|\tilde{v}^k_n| = d_k$, hence we observe a performance close to the optimal one, and much better than that of DMLE. However, as $X$ increases beyond $X = 5$, $q^k_n$ dominates $|v^k_n| = d_k + q^k_n$, i.e., $q^k_n \gg d_k$, and thus the performance of LT-DMLE diverges from the optimal performance, and stays close to that of DMLE, which also diverges.

Fig. 5. Average stopping time, i.e., $E[\mathcal{T}]$, vs. MSE, i.e., $E[(\tilde{x}_{\mathcal{T}} - x)^2]$, for the optimal centralized estimator and the six decentralized estimators with $r_U = r_V = 1$.

B. Fading Channels 

Recall that under fading channels the sensor $k$ needs to transmit two random processes, namely, $U^k_t$ and $V^k_t$. The former should be sequentially transmitted since it determines the stopping time. Hence, we have two options to transmit $U^k_t$, namely, the conventional uniform sampler followed by a quantizer and the level-triggered sampler. On the other hand, we have three options for $V^k_t$ as it can also be transmitted non-sequentially (at once) at the stopping time. As a result, we can construct six different decentralized estimators under fading channels, four of which are described in Section IV-B and analyzed in Section V-B.

Fixed $K$, $\mathrm{SNR}_k$, and $X$, varying MSE: In this subsection, for the sake of completeness we also provide simulation results for the remaining two estimators, which are the combinations of uniform sampling and level-triggered sampling, namely, LU-dsDMLE (using level-triggered sampling for $U^k_t$, and uniform sampling for $V^k_t$) and UL-dsDMLE (using uniform sampling for $U^k_t$, and level-triggered sampling for $V^k_t$). As in the AWGN case, we set $K = 5$, $\mathrm{SNR}_k = 0$ dB, $\forall k$, $X = 5$, and vary $\bar{I} = 25 \times 2^m$, $T_V = E[\tau^k_V] = 2 \times 1.4^m$, $\forall k$, where $m = 0, \ldots, 5$. We also set $T_U = T_V$.

We plot $E[\mathcal{T}]$ vs. MSE in Fig. 5. The first important observation is the poor performance of the schemes using uniform sampling to transmit $V^k_t$ (LU-dsDMLE and U-dsDMLE). This is because the local incremental process $v^k_{mT_V}$, which forms the $m$-th message from the sensor $k$, can take both negative and positive values, that is, $v^k_{mT_V} \in (-T_V\phi, T_V\phi)$ (cf. Section IV-B4), and with $r_V = 1$ it cannot be accurately quantized. On the other hand, Fig. 5 shows that in the schemes using level-triggered sampling for $V^k_t$ (LT-dsDMLE and UL-dsDMLE) $r_V = 1$ suffices to represent the local process well enough at the random sampling time $t^k_{n,V}$ [cf. (10)].

Note that the two schemes LU-dsDMLE and UL-dsDMLE considered only in this section, which are the combinations of uniform sampling and level-triggered sampling, perform worse than their "pure" counterparts U-dsDMLE and LT-dsDMLE, respectively. This is due to a compatibility problem between $\tilde{U}_t$ and $\tilde{V}_t$ in these "mixed" schemes. We observe such a problem since the decentralized estimates in this paper are computed as the ratio of $\tilde{V}_t$ to $\tilde{U}_t$, and when $\tilde{U}_t$ and $\tilde{V}_t$ are computed via different methods (i.e., one via uniform sampling and the other via level-triggered sampling), the quantization errors in $\tilde{U}_t$ and $\tilde{V}_t$ are of different orders of magnitude. To put it another way, in a "pure" scheme, like LT-dsDMLE, the quantization errors in $\tilde{U}_t$ and $\tilde{V}_t$ are of the same order of magnitude, and when the ratio is taken they compensate each other better than in a "mixed" scheme, like UL-dsDMLE. Hence, disregarding the "mixed" schemes, we see that the doubly sequential LT-dsDMLE based on level-triggered sampling significantly outperforms the doubly sequential U-dsDMLE based on uniform sampling. These two schemes are of special interest to us, as only the doubly sequential schemes enable low bandwidth usage.

The singly sequential schemes LT-sDMLE and U-sDMLE, using much higher bandwidth than their doubly sequential counterparts LT-dsDMLE and U-dsDMLE, improve their performance and outperform LT-dsDMLE after some point as the target MSE gets smaller. This is expected since LT-sDMLE and U-sDMLE use more and more bits (i.e., consume higher and higher bandwidth) to transmit $V^k_{\mathcal{T}}$ as the stopping time $\mathcal{T}$ grows. Hence, in fact, the comparison between the singly sequential schemes and the doubly sequential schemes is not completely fair. Note that here in the fading case, the performances of LT-sDMLE and U-sDMLE do not converge to that of the optimal centralized scheme (unlike DMLE in Fig. 1) since $U_t$ is sequentially transmitted with a constant number of bits, $r_U = 1$. Finally, at moderate and high MSE values, we observe the same compatibility problem of "mixed" schemes in LT-sDMLE since a conventional quantizer is used to transmit $V^k_{\mathcal{T}}$, whereas $U^k_t$ is transmitted via level-triggered sampling. Therefore, U-sDMLE, using conventional quantizers in transmitting both $V^k_{\mathcal{T}}$ and $U^k_t$ (although the latter is sequentially transmitted), performs better than LT-sDMLE at moderate and high MSE values. However, at low MSE values the singly sequential schemes practically transmit $V^k_{\mathcal{T}}$ exactly (as the number of bits $R_k$ gets larger), eliminating the compatibility problem in LT-sDMLE, and thus LT-sDMLE outperforms U-sDMLE, demonstrating the superiority of level-triggered sampling over uniform sampling in another way.

Fixed MSE, $\mathrm{SNR}_k$, and $X$, varying $K$: Henceforth, for the sake of clarity and brevity, we will only consider the "pure" doubly sequential estimators, LT-dsDMLE and U-dsDMLE, since the singly sequential estimators, LT-sDMLE and U-sDMLE, violate the low bandwidth usage requirement, and the "mixed" doubly sequential estimators, LU-dsDMLE and UL-dsDMLE, suffer from the compatibility problem. Next we set MSE $= 10^{-2}$, $\mathrm{SNR}_k = 0$ dB, $\forall k$, $X = 5$, and vary $K = 2, \ldots, 10$. To make the MSEs of the optimal centralized estimator, LT-dsDMLE and U-dsDMLE equal to the target value, the target Fisher information and the average sampling intervals, for each scheme, are determined as $\bar{I} = 25 \times 2^s$, and $T_U = T_V = E[\tau^k_V] = 2 \times 1.4^s$, $\forall k$, where $s \in \mathbb{R}$. Note that for each scheme $s$ takes different values in general. We will use $r_U = 1$, as before, but $r_V = 2$ from now on to enable U-dsDMLE to achieve the target MSE (see Fig. 5).

Fig. 6. Average stopping time, i.e., $E[\mathcal{T}]$, vs. the number of sensors, i.e., $K$, for the optimal centralized estimator, LT-dsDMLE, and U-dsDMLE with $r_U = 1$, $r_V = 2$.

As shown in Fig. 6, the average stopping time of the centralized scheme decays with a rate close to $1/K$, whereas those of LT-dsDMLE and U-dsDMLE, the former being faster, are slower than $1/K$ for the same reason as in the AWGN case. Recall that in the AWGN case $r_V = 2$ was sufficient for the decentralized schemes to enjoy the increasing sensor diversity completely (see Fig. 2-b). However, here under fading channels $r_V = 2$, together with $r_U = 1$, does not suffice to alleviate the quantization error problem to fully exploit the increasing sensor diversity.

Fixed MSE, $K$, and $X$, varying $\mathrm{SNR}_k$: We set $K = 5$, and vary $\mathrm{SNR}_k = -20, -10, \ldots, 20$ dB, $\forall k$. It is seen in Fig. 7 that the average stopping times of the centralized estimator, LT-dsDMLE and U-dsDMLE decrease with increasing SNR, as expected, but the rates of LT-dsDMLE and U-dsDMLE slow down for the same reason as in the AWGN case. The quantities to be transmitted become larger as SNR increases, hence with constant $r_U$ and $r_V$ the quantization errors and overshoots get larger, slowing down the performance improvement. We observe that the average stopping time of U-dsDMLE is likely to stop decreasing after 20 dB, whereas that of LT-dsDMLE continues to decrease since the rate of increase of the overshoots in this case is slower than that of the quantization errors in U-dsDMLE, demonstrating another advantage of LT-dsDMLE over U-dsDMLE. Specifically, U-dsDMLE quantizes $u^k_{mT_U} \in [0, T_U\theta)$ and $v^k_{mT_V} \in (-T_V\phi, T_V\phi)$, where $\theta$ and $\phi$ increase with increasing SNR. On the other hand, the overshoots in LT-dsDMLE are confined to $[0, \theta)$ and $[0, \phi)$ (cf. Section IV).

Fig. 7. Average stopping time, i.e., $E[\mathcal{T}]$, vs. SNR, i.e., $E[|h^k_t|^2]/\sigma_k^2$, for the optimal centralized estimator, LT-dsDMLE, and U-dsDMLE with $r_U = 1$, $r_V = 2$.

Fixed MSE, $K$, and $\mathrm{SNR}_k$, varying $X$: Lastly, we vary $X = 5\sqrt{10^{m}}$, where $m = -2, \ldots, 2$, setting
the other parameters to the same values used in the previous subsections. Fig. 8 shows that the average
stopping times of the decentralized schemes diverge from that of the centralized scheme as $X$ increases,
since the overshoots and the quantization errors grow with increasing $X$ in LT-dsDMLE and U-dsDMLE,
respectively, as described in the AWGN case. In particular, we observe that increasing $X$ causes $\phi$ to
grow, hence, as explained in the previous subsection, the quantization errors in U-dsDMLE grow much
faster than the overshoots in LT-dsDMLE as $X$ increases. Accordingly, U-dsDMLE diverges much more quickly
than LT-dsDMLE, as shown in Fig. 8.

Fig. 8. Average stopping time, i.e., $\mathsf{E}[\mathcal{T}]$, vs. the bounding constant of $x$, i.e., $X$, for the optimal centralized estimator,
LT-dsDMLE, and U-dsDMLE with $r_U = 1$, $r_V = 2$.

VII. Conclusion 

We have proposed and rigorously analyzed a new decentralized estimation framework based on a non- 
uniform sampling technique, namely level-triggered sampling. Level-triggered sampling, eliminating the 
need for quantization, produces a single bit, and thus provides an efficient way of information transmission 
in decentralized systems. It is used in the proposed estimator to effectively report local observations at 
sensors to a fusion center (FC). Messages received from sensors are combined at the FC to compute 
approximation(s) to global random process(es) that characterize(s) the centralized maximum likelihood 
estimator (MLE), shown to be optimal. The proposed estimator operates under both non-fading and fading 
listening channels (through which sensors collect their observations), which makes it unique among the 
decentralized estimators in the literature since fading listening channels have been considered for the 
first time in this paper. Performing an asymptotic analysis we have determined the conditions under 
which the proposed estimator and the decentralized estimator based on conventional uniform sampling 
are asymptotically unbiased, consistent and asymptotically optimal. In particular, it is sufficient for the 
proposed estimator to have average communication (sampling) intervals tending to infinity at rates lower 
than specific upper bounds, and transmit a constant number of bits at each communication time. On 
the other hand, for the scheme based on uniform sampling the number of bits transmitted at each 
communication time has to tend to infinity at rates faster than specific lower bounds, regardless of the 
average communication intervals. Since, for the sake of low bandwidth usage and a low communication rate,
it is practically desirable to have the number of bits as small as possible and the average communication
intervals as large as possible, the analytical results clearly demonstrate the superiority of the proposed
scheme over the conventional scheme. Simulation results further demonstrate the superior non-asymptotic 
performance of the proposed scheme based on level-triggered sampling under different conditions. 
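
The single-bit reporting mechanism summarized above can be illustrated with a minimal sketch. It is not the estimator analyzed in this paper: it assumes a generic real-valued local process with bounded increments, a single threshold `d`, and an FC that ignores the overshoot entirely. The function names, the drift value, and the threshold are hypothetical choices made only for illustration.

```python
import random

def level_triggered_bits(increments, d):
    """Sensor side: emit (time, bit) whenever the sum accumulated since the
    last sampling time leaves the interval (-d, d); bit 1 = upper crossing."""
    acc, out = 0.0, []
    for t, inc in enumerate(increments, start=1):
        acc += inc
        if abs(acc) >= d:
            out.append((t, 1 if acc > 0 else 0))
            acc = 0.0  # restart accumulation at the new sampling time
    return out

def fc_approximation(bits, d):
    """FC side: add +d or -d per received bit (overshoot ignored here)."""
    return sum(d if b == 1 else -d for _, b in bits)

random.seed(0)
max_inc = 1.3                      # assumed bound on |increment|
incs = [random.uniform(-1.0, 1.0) + 0.3 for _ in range(2000)]
d = 5.0
bits = level_triggered_bits(incs, d)
approx = fc_approximation(bits, d)
exact = sum(incs)
print(len(bits), exact, approx)
```

Under these assumptions each message contributes an overshoot smaller than the increment bound, and the unreported remainder after the last sampling time is smaller than `d`, so the FC error is at most `len(bits) * max_inc + d`.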

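Appendix A: Proof of Lemma 1

From (10) it is seen that $\hat{x}_t$ is conditionally unbiased. Consistency and efficiency follow from (10)
and (11). We have $\mathsf{E}[(\hat{x}_t - x)^2 \mid \mathcal{H}_t] = \mathrm{Var}(\hat{x}_t \mid \mathcal{H}_t) = 1/U_t$, which attains the Cramér-Rao lower bound, i.e., efficiency. If we have $U_t \xrightarrow{a.s.} \infty$, i.e., $\mathsf{P}\big(\lim_{t\to\infty} \sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2} = \infty\big) = 1$, then $\hat{x}_t \xrightarrow{L^2} x$, implying $\hat{x}_t \xrightarrow{p} x$ given $\mathcal{H}_t$, i.e.,
consistency. If $\mathsf{P}\big(\lim_{t\to\infty} \sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2} = \infty\big) \neq 1$, then there exists some $M < \infty$ such that
$\mathsf{P}\big(\lim_{t\to\infty} \sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2} < M\big) \neq 0$. Hence, it suffices to show that $\mathsf{P}\big(\lim_{t\to\infty} \sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2} < M\big) = 0$, $\forall M < \infty$. Note that

$$
\begin{aligned}
\mathsf{P}\Big(\lim_{t\to\infty} \sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2} < M\Big)
&= \lim_{t\to\infty} \mathsf{P}\Big(\sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2} < M\Big) \\
&= \lim_{t\to\infty} \mathsf{P}\Big(\exp\Big(-\sum_{k=1}^{K}\sum_{\tau=1}^{t} \frac{2|h_\tau^k|^2}{\sigma_k^2}\Big) > \exp(-M)\Big) \\
&\le \lim_{t\to\infty} \frac{\Big(\mathsf{E}\Big[\exp\Big(-\sum_{k=1}^{K} \frac{2|h_1^k|^2}{\sigma_k^2}\Big)\Big]\Big)^{t}}{\exp(-M)},
\end{aligned}
$$

where the last inequality follows from Markov's inequality and the fact that $\big\{\sum_{k=1}^{K} \frac{2|h_\tau^k|^2}{\sigma_k^2}\big\}_\tau$ are i.i.d.
Now, since $\sum_{k=1}^{K} \frac{2|h_1^k|^2}{\sigma_k^2} > 0$, we have $\exp\big(-\sum_{k=1}^{K} \frac{2|h_1^k|^2}{\sigma_k^2}\big) < 1$ and $\mathsf{E}\big[\exp\big(-\sum_{k=1}^{K} \frac{2|h_1^k|^2}{\sigma_k^2}\big)\big] < 1$. Hence,
$\lim_{t\to\infty} \big(\mathsf{E}\big[\exp\big(-\sum_{k=1}^{K} \frac{2|h_1^k|^2}{\sigma_k^2}\big)\big]\big)^{t} = 0$, concluding the proof.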
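
The Markov-inequality step used in this proof can be checked numerically. The sketch below is only an illustration under assumptions that are not part of the paper: the per-sample terms are taken to be i.i.d. exponential random variables (as would arise for $|h|^2$ under Rayleigh fading) with a hypothetical mean `mean_gain`, for which $\mathsf{E}[e^{-X}] = 1/(1 + \text{mean})$. It compares the bound $e^{M}(\mathsf{E}[e^{-X}])^{Kt}$ with a Monte Carlo estimate of the probability that the accumulated sum stays below $M$.

```python
import math
import random

def markov_bound(t, M, mean_gain, K=1):
    """Upper bound exp(M) * (E[exp(-X)])^(K*t) for X ~ Exp(mean=mean_gain)."""
    return math.exp(M) * (1.0 / (1.0 + mean_gain)) ** (K * t)

def empirical_prob(t, M, mean_gain, K=1, trials=20000, seed=1):
    """Monte Carlo estimate of P(sum of K*t i.i.d. exponential terms < M)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # expovariate takes the rate, i.e., 1/mean
        s = sum(rng.expovariate(1.0 / mean_gain) for _ in range(K * t))
        hits += s < M
    return hits / trials

bound_5 = markov_bound(t=5, M=2.0, mean_gain=1.0)
bound_10 = markov_bound(t=10, M=2.0, mean_gain=1.0)
prob_5 = empirical_prob(t=5, M=2.0, mean_gain=1.0)
print(prob_5, bound_5, bound_10)
```

The bound decays geometrically in $t$ for any fixed $M$, which is the property the proof relies on.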

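Appendix B: Proof of Theorem 1
To prove the first part of the theorem it is sufficient to show that $\tilde{x}_{t_{\bar I}} \xrightarrow{L^1} x$, i.e., $\mathsf{E}[|\tilde{x}_{t_{\bar I}} - x|] \to 0$, since
$\tilde{x}_{t_{\bar I}} \xrightarrow{L^1} x$ implies both $\tilde{x}_{t_{\bar I}} \xrightarrow{p} x$ and $\mathsf{E}[\tilde{x}_{t_{\bar I}} - x] \to 0$. Since we have $\mathsf{E}[|\tilde{x}_{t_{\bar I}} - x|] = \mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}} + \hat{x}_{t_{\bar I}} - x|] \le
\mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] + \mathsf{E}[|\hat{x}_{t_{\bar I}} - x|]$, and $\mathsf{E}[|\hat{x}_{t_{\bar I}} - x|] \to 0$ is implied by $\hat{x}_{t_{\bar I}} \xrightarrow{L^2} x$ from Lemma 1, we only need to
show that $\mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \to 0$. Using (??) and (15) we write $|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}| = |\tilde{V}_{t_{\bar I}} - V_{t_{\bar I}}| / U_{t_{\bar I}}$, as $U_{t_{\bar I}} > 0$ [cf. (??)].
From (??) and (??) we have

$$ |\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}| \le \frac{\sum_{k=1}^{K} |\tilde{V}^k_{t_{\bar I}} - V^k_{t_{\bar I}}|}{U_{t_{\bar I}}}. \tag{29} $$

Using a reasonable quantizer we have the quantization error bounded as $|\tilde{V}^k_{t_{\bar I}} - V^k_{t_{\bar I}}| < |V^k_{t_{\bar I}}| / 2^{R_k}$. Recall
from Section IV-A that $\big|\frac{2\Re((y_t^k)^* h_k)}{\sigma_k^2}\big| < \phi$, $\forall k, t$. Then, we have $|V^k_{t_{\bar I}}| < t_{\bar I}\,\phi$, implying $|V^k_{t_{\bar I}}| = O(t_{\bar I})$
and $\mathsf{E}[|V^k_{t_{\bar I}}|] = O(t_{\bar I})$. Taking the expectations on both sides in (29) we write

$$ \mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \le \sum_{k=1}^{K} \frac{\mathsf{E}[|V^k_{t_{\bar I}}|]}{2^{R_k}\, U_{t_{\bar I}}} = \sum_{k=1}^{K} \frac{O(1)}{2^{R_k}}, \tag{30} $$

since we have $U_{t_{\bar I}} = t_{\bar I} \sum_{k=1}^{K} \frac{2|h_k|^2}{\sigma_k^2} = \Theta(t_{\bar I})$. Hence, it is sufficient to have $R_k \to \infty$, $\forall k$, at any rate
for asymptotic unbiasedness and consistency.

For asymptotic optimality, note that $I_{t_{\bar I}} = U_{t_{\bar I}}$, and we can write $\sqrt{U_{t_{\bar I}}}\,(\tilde{x}_{t_{\bar I}} - x) = \sqrt{U_{t_{\bar I}}}\,(\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}) + \sqrt{U_{t_{\bar I}}}\,(\hat{x}_{t_{\bar I}} - x)$. From (10) we have $\sqrt{U_{t_{\bar I}}}\,(\hat{x}_{t_{\bar I}} - x) \sim \mathcal{N}(0, 1)$. Hence, it is sufficient to show
that $\sqrt{U_{t_{\bar I}}}\,(\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}) \to 0$ as $\bar{I} \to \infty$. From (29) and (30) we can write

$$ \sqrt{U_{t_{\bar I}}}\; |\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}| \le \sum_{k=1}^{K} O(1)\, \frac{\sqrt{U_{t_{\bar I}}}}{2^{R_k}}, \tag{31} $$

implying that $\sqrt{U_{t_{\bar I}}}\,(\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}) \to 0$ if $R_k = \omega(\log \sqrt{U_{t_{\bar I}}})$, $\forall k$. We have from (??) that $\bar{I} \le U_{t_{\bar I}} <
\bar{I} + \sum_{k=1}^{K} \frac{2|h_k|^2}{\sigma_k^2}$, implying that $U_{t_{\bar I}} = \bar{I} + O(1)$, and hence the result in Theorem 1.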

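Appendix C: Proof of Theorem 2
As stated in the proof of Theorem 1, it is sufficient to show that $\mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \to 0$. Note from (8)
that $V^k_{t_{\bar I}} = \sum_{n=1}^{N^k} v^k_n + \sum_{\tau = t^k_{N^k}+1}^{t_{\bar I}} \frac{2\Re((y^k_\tau)^* h_k)}{\sigma_k^2}$, and from (??) that $\tilde{V}^k_{t_{\bar I}} = \sum_{n=1}^{N^k} \tilde{v}^k_n$. Thus, as in (29),
we have

$$ |\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}| \le \frac{\sum_{k=1}^{K} \sum_{n=1}^{N^k} |\tilde{v}^k_n - v^k_n| + \sum_{k=1}^{K} \Big| \sum_{\tau = t^k_{N^k}+1}^{t_{\bar I}} \frac{2\Re((y^k_\tau)^* h_k)}{\sigma_k^2} \Big|}{U_{t_{\bar I}}}, \tag{32} $$

where in the second term of the right-hand side we can write $\Big| \sum_{\tau = t^k_{N^k}+1}^{t_{\bar I}} \frac{2\Re((y^k_\tau)^* h_k)}{\sigma_k^2} \Big| < d_k$, since it is
known that no sampling occurs between the last sampling time, $t^k_{N^k}$, and the stopping time, $t_{\bar I}$. Taking
the expectations of both sides in (32) and noting that $U_{t_{\bar I}} = t_{\bar I} \sum_{k=1}^{K} \frac{2|h_k|^2}{\sigma_k^2}$ in the AWGN case, we write

$$ \mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \le \frac{1}{\sum_{k=1}^{K} \frac{2|h_k|^2}{\sigma_k^2}} \left( \sum_{k=1}^{K} \frac{\mathsf{E}\big[\sum_{n=1}^{N^k} |\tilde{v}^k_n - v^k_n|\big]}{t_{\bar I}} + \sum_{k=1}^{K} \frac{d_k}{t_{\bar I}} \right), \tag{33} $$

where on the right-hand side the term outside the parentheses is a constant that does not depend on $t_{\bar I}$.
In the first term inside the parentheses, $|\tilde{v}^k_n - v^k_n|$ is the quantization error in absolute value of the $n$-th
message from sensor $k$. Noting that $v^k_n = \sum_{\tau = t^k_{n-1}+1}^{t^k_n} \frac{2\Re((y^k_\tau)^* h_k)}{\sigma_k^2}$, we see that $|\tilde{v}^k_n - v^k_n|$ depends only on
the observations in the $n$-th intersampling period, i.e., $\{y^k_\tau\}$, $\tau \in (t^k_{n-1}, t^k_n]$, and thus $\{|\tilde{v}^k_n - v^k_n|\}_n$ are i.i.d.
Hence, the term $\sum_{n=1}^{N^k} |\tilde{v}^k_n - v^k_n|$ in (33) is a renewal reward process. Note from (12) that $t_{\bar I} = \Theta(\bar{I})$, i.e.,
$t_{\bar I} \to \infty$ as $\bar{I} \to \infty$. Hence, from [34, Theorem 3.6.1] we
have $\frac{\mathsf{E}[\sum_{n=1}^{N^k} |\tilde{v}^k_n - v^k_n|]}{t_{\bar I}} \to \frac{\mathsf{E}[|\tilde{v}^k_1 - v^k_1|]}{\mathsf{E}[\tau^k_1]}$ as $\bar{I} \to \infty$,
where $\mathsf{E}[\tau^k_1]$ is the average sampling interval of sensor $k$. Then, it is sufficient to show that

$$ \frac{\mathsf{E}[|\tilde{v}^k_1 - v^k_1|]}{\mathsf{E}[\tau^k_1]} \to 0, \quad \text{and} \quad \frac{d_k}{t_{\bar I}} \to 0, \quad \forall k, \tag{34} $$

as $\bar{I} \to \infty$. If $d_k \to \infty$ so that $d_k = o(\bar{I})$, i.e., $d_k = o(t_{\bar I})$, $\forall k$, then both conditions in (34) will be
satisfied, since $\mathsf{E}[|\tilde{v}^k_1 - v^k_1|] = O(1)$ for $r_V = O(1)$, and $\mathsf{E}[\tau^k_1] \to \infty$ as $d_k \to \infty$ [cf. (??)], concluding
the proof.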

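Appendix D: Proof of Theorem 3

As stated in the proof of Theorem 1, it is sufficient to show that $\sqrt{U_{t_{\bar I}}}\,(\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}) \to 0$ as $\bar{I} \to \infty$. If
we show that $\sqrt{U_{t_{\bar I}}}\;\mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \to 0$, then from Markov's inequality we will have $\sqrt{U_{t_{\bar I}}}\,|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}| \xrightarrow{p} 0$,
which implies $\sqrt{U_{t_{\bar I}}}\,(\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}) \xrightarrow{p} 0$. From (33) and the discussion following it, we can write

$$ \sqrt{U_{t_{\bar I}}}\;\mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \le O(1) \left( \sum_{k=1}^{K} \frac{\sqrt{U_{t_{\bar I}}}\;\mathsf{E}[|\tilde{v}^k_1 - v^k_1|]}{\mathsf{E}[\tau^k_1]} + \sum_{k=1}^{K} \frac{d_k}{\sqrt{U_{t_{\bar I}}}} \right) \tag{35} $$

as $\bar{I} \to \infty$. Since $\big\{\frac{2\Re((y^k_\tau)^* h_k)}{\sigma_k^2}\big\}_\tau$ are i.i.d., from (16) $\{t^k_n\}_n$ is a renewal process. Hence, using Wald's
identity we can write $\mathsf{E}[v^k_1] = \mathsf{E}[\tau^k_1]\,\mathsf{E}\big[\frac{2\Re((y^k_1)^* h_k)}{\sigma_k^2}\big]$, where $\mathsf{E}\big[\frac{2\Re((y^k_1)^* h_k)}{\sigma_k^2}\big] = O(1)$ (cf. Section IV-A). At
each sampling time, $v^k_n$ either crosses $d_k$ or $-d_k$, hence its expectation is given by $\mathsf{E}[v^k_1] = (1 - \alpha_k)(d_k +
\mathsf{E}[q^k_1 \mid v^k_1 \ge d_k]) + \alpha_k(-d_k - \mathsf{E}[q^k_1 \mid v^k_1 \le -d_k])$, where $\alpha_k = \mathsf{P}(v^k_n \le -d_k)$, and $q^k_n$ is the over(under)shoot
bounded by $\phi$ as given in Section IV-A2. Therefore, we have $\mathsf{E}[v^k_1] = \Theta(d_k)$ and $\mathsf{E}[\tau^k_1] = \Theta(d_k)$. Note
that by using a reasonable quantizer, the quantization error $|\tilde{v}^k_1 - v^k_1| = |\tilde{q}^k_1 - q^k_1|$ is bounded by $\phi/2^{r_V - 1}$,
hence $\mathsf{E}[|\tilde{v}^k_1 - v^k_1|] \le \phi/2^{r_V - 1}$. Accordingly, we rewrite (35) as

$$ \sqrt{U_{t_{\bar I}}}\;\mathsf{E}[|\tilde{x}_{t_{\bar I}} - \hat{x}_{t_{\bar I}}|] \le O(1) \left( \sum_{k=1}^{K} \frac{\sqrt{U_{t_{\bar I}}}\;\phi}{\Theta(d_k)\, 2^{r_V - 1}} + \sum_{k=1}^{K} \frac{d_k}{\sqrt{U_{t_{\bar I}}}} \right), \tag{36} $$

which concludes the proof since $U_{t_{\bar I}} = \bar{I} + O(1)$, as shown in the proof of Theorem 1.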

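Appendix E: Proof of Lemma 3
We start with the level-triggered sampling based estimators, and continue with the uniform sampling
based ones. In addition to the upper bound, $\theta < \infty$, for $\frac{2|h^k_t|^2}{\sigma_k^2}$ (cf. Section IV-B), assume further a
lower bound $\underline{\theta} > 0$, so that we have $\underline{\theta} \le \frac{2|h^k_t|^2}{\sigma_k^2} \le \theta$, $\forall k, t$. Since the overshoot $p^k_n$ cannot exceed
$\theta$, the incremental process $u^k_n$, and accordingly its quantized value $\tilde{u}^k_n$, are upper bounded by $e_k + \theta$,
i.e., $e_k \le \tilde{u}^k_n < e_k + \theta$. Note that $\tilde{U}_{\mathcal{T}}$ cannot exceed the target Fisher information, $\bar{I}$, by more than
$\sum_{k=1}^{K} (e_k + \theta)$, in which case all sensors transmit their largest possible messages at the stopping time.
Hence, we write

$$ \bar{I} \le \tilde{U}_{\mathcal{T}} = \sum_{k=1}^{K} \sum_{n=1}^{M^k} \tilde{u}^k_n < \bar{I} + \sum_{k=1}^{K} (e_k + \theta), \tag{37} $$

followed by

$$ \sum_{k=1}^{K} (e_k + \theta)\, M^k \ge \bar{I}, \tag{38} $$

$$ \text{and} \quad \sum_{k=1}^{K} e_k\, M^k < \bar{I} + \sum_{k=1}^{K} (e_k + \theta). \tag{39} $$

Consider the intersampling interval, $\tau^k_{n,U} = t^k_{n,U} - t^k_{n-1,U}$, for which, similar to (58), we can write

$$ \sum_{n=1}^{M^k} \tau^k_{n,U} \le \mathcal{T} < \sum_{n=1}^{M^k + 1} \tau^k_{n,U}. \tag{40} $$

Note that $\tau^k_{n,U}$ is lower and upper bounded by $e_k/\theta$ and $(e_k + \theta)/\underline{\theta}$, respectively. Hence, from (40) we
write

$$ \frac{\mathcal{T}\,\underline{\theta}}{e_k + \theta} - 1 < M^k \le \frac{\mathcal{T}\,\theta}{e_k}. \tag{41} $$

Substituting the lower and upper bounds in (41) into (39) and (38), respectively, we get

$$ \mathcal{T} \left( K\theta + \theta^2 \sum_{k=1}^{K} \frac{1}{e_k} \right) \ge \bar{I}, \tag{42} $$

$$ \mathcal{T}\,\underline{\theta} \sum_{k=1}^{K} \frac{e_k}{e_k + \theta} < \bar{I} + \sum_{k=1}^{K} (2 e_k + \theta), \tag{43} $$

from which it is seen that $\mathcal{T} = \Theta(\bar{I})$ and $\mathsf{E}[\mathcal{T}] = \Theta(\bar{I})$ if $e_k = o(\bar{I})$, concluding the proof for LT-sDMLE
and LT-dsDMLE.

Note that, for U-sDMLE and U-dsDMLE, when the scheme stops at time $\mathcal{T}$, the overshoot over $\bar{I}$ is
upper bounded by $K T_U \theta$, corresponding to the worst case scenario in which $\tilde{U}_{(M_{\mathcal{T}} - 1) T_U}$ is just below
$\bar{I}$, and all sensors transmit the largest message possible at time $\mathcal{T}$. Here, $M_{\mathcal{T}} = \mathcal{T}/T_U$ is the number of
sampling times until $\mathcal{T}$. Hence, similar to (37) we write

$$ \bar{I} \le \tilde{U}_{\mathcal{T}} = \sum_{k=1}^{K} \sum_{m=1}^{M_{\mathcal{T}}} \tilde{u}^k_{m T_U} < \bar{I} + K T_U \theta. \tag{44} $$

Since $T_U \underline{\theta} \le \tilde{u}^k_{m T_U} \le T_U \theta$, we have

$$ \frac{\bar{I}}{K T_U \theta} \le M_{\mathcal{T}} < \frac{\bar{I}}{K T_U \underline{\theta}} + \frac{\theta}{\underline{\theta}}, \tag{45} $$

and thus

$$ \frac{\bar{I}}{K \theta} \le \mathcal{T} < \frac{\bar{I}}{K \underline{\theta}} + \frac{\theta}{\underline{\theta}}\, T_U. \tag{46} $$

Since $K$, $\theta$, and $\underline{\theta}$ are all $O(1)$, we have $\mathcal{T} = \Theta(\bar{I})$, and $\mathsf{E}[\mathcal{T}] = \Theta(\bar{I})$ if $T_U = o(\bar{I})$, concluding the
proof.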
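
The sandwich bound on the stopping time under uniform sampling can be checked with a short simulation. The sketch below rests on simplifying assumptions that are not taken from the paper: the per-sample terms are drawn uniformly from $[\underline{\theta}, \theta]$, the messages are unquantized, and all names and parameter values are hypothetical. It verifies that the simulated stopping time falls between $\bar{I}/(K\theta)$ and $\bar{I}/(K\underline{\theta}) + (\theta/\underline{\theta})\,T_U$.

```python
import random

def uniform_stopping_time(I_bar, K, T_U, lo, hi, seed=0):
    """Stop at the first multiple of T_U at which the accumulated sum over
    K sensors reaches I_bar; per-sample terms lie in [lo, hi]."""
    rng = random.Random(seed)
    U, t = 0.0, 0
    while True:
        # each sensor reports the sum of its T_U most recent terms
        for _ in range(K):
            U += sum(rng.uniform(lo, hi) for _ in range(T_U))
        t += T_U
        if U >= I_bar:
            return t

I_bar, K, T_U, lo, hi = 500.0, 5, 4, 0.5, 2.0
T = uniform_stopping_time(I_bar, K, T_U, lo, hi)
lower = I_bar / (K * hi)
upper = I_bar / (K * lo) + (hi / lo) * T_U
print(lower, T, upper)
```

Both bounds scale linearly with `I_bar` when `T_U` is small relative to it, which is the linear-growth behavior the lemma asserts.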



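Appendix F: Proof of Lemma 4
We present the proofs first for LT-sDMLE and LT-dsDMLE, and then for U-sDMLE and U-dsDMLE.
Note that we have $U^k_{\mathcal{T}} = \sum_{n=1}^{M^k} u^k_n + \sum_{\tau = t^k_{M^k,U}+1}^{\mathcal{T}} \frac{2|h^k_\tau|^2}{\sigma_k^2}$, and from (??) that $\tilde{U}^k_{\mathcal{T}} = \sum_{n=1}^{M^k} \tilde{u}^k_n$. Hence, we
write

$$
\begin{aligned}
\frac{\mathsf{E}[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|]}{\bar{I}}
&\le \frac{\sum_{k=1}^{K} \mathsf{E}\big[\sum_{n=1}^{M^k} |\tilde{u}^k_n - u^k_n|\big] + \sum_{k=1}^{K} \mathsf{E}\Big[\sum_{\tau = t^k_{M^k,U}+1}^{\mathcal{T}} \frac{2|h^k_\tau|^2}{\sigma_k^2}\Big]}{\bar{I}} \\
&< \frac{\theta}{2^{r_U}} \sum_{k=1}^{K} \frac{\mathsf{E}[M^k]}{\bar{I}} + \sum_{k=1}^{K} \frac{e_k}{\bar{I}},
\end{aligned}
\tag{47}
$$

where we used $|\tilde{u}^k_n - u^k_n| = |\tilde{p}^k_n - p^k_n| < \theta/2^{r_U}$ (cf. Section IV-B), and $\sum_{\tau = t^k_{M^k,U}+1}^{\mathcal{T}} \frac{2|h^k_\tau|^2}{\sigma_k^2} < e_k$ since
no sampling occurs between $t^k_{M^k,U}$ and $\mathcal{T}$. From (41), we can write $\frac{\mathsf{E}[\mathcal{T}]\,\underline{\theta}}{e_k + \theta} - 1 < \mathsf{E}[M^k] \le \frac{\mathsf{E}[\mathcal{T}]\,\theta}{e_k}$. Having
$\mathcal{T} = \Theta(\bar{I})$ and $\mathsf{E}[\mathcal{T}] = \Theta(\bar{I})$ from Lemma 3 and $e_k = o(\bar{I})$, we write $M^k = \Theta(\bar{I}/e_k)$ and $\mathsf{E}[M^k] =
\Theta(\bar{I}/e_k)$, hence we have the result in Lemma 4 for LT-sDMLE and LT-dsDMLE.

For U-sDMLE and U-dsDMLE, similar to (60) we write

$$ \frac{\mathsf{E}[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|]}{\bar{I}} \le \frac{K T_U \theta}{2^{r_U}}\, \frac{\mathsf{E}[M_{\mathcal{T}}]}{\bar{I}}, \tag{48} $$

as we have $|\tilde{u}^k_{m T_U} - u^k_{m T_U}| < T_U \theta / 2^{r_U}$ with a reasonable quantizer. Note that in (48) we lack the term
representing the missing information at the FC between the last sampling time and the stopping time, e.g.,
the second term in (60), since we have $\mathcal{T} = M_{\mathcal{T}} T_U$. Then, from Lemma 3 we have $\mathsf{E}[M_{\mathcal{T}}] = O(\bar{I}/T_U)$.
Consequently, (48) tends to $0$ if $r_U \to \infty$ at any rate, concluding the proof.

Appendix G: Proof of Lemma 5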

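We again start with the level-triggered sampling based estimators, then continue with the uniform
sampling based ones. Using the Cauchy-Schwarz inequality we write

$$
\begin{aligned}
\frac{\mathsf{E}[|V_{\mathcal{T}} - U_{\mathcal{T}}\, x|]}{\bar{I}}
&\le \frac{\sqrt{\mathsf{E}[U_{\mathcal{T}}]}\, \sqrt{\mathsf{E}\big[(V_{\mathcal{T}} - U_{\mathcal{T}}\, x)^2 / U_{\mathcal{T}}\big]}}{\bar{I}} = \frac{\sqrt{\mathsf{E}[U_{\mathcal{T}}]}}{\bar{I}} \\
&\le \frac{\sqrt{\mathsf{E}[\tilde{U}_{\mathcal{T}}] + \frac{\theta}{2^{r_U}} \sum_{k=1}^{K} \mathsf{E}[M^k] + \sum_{k=1}^{K} e_k}}{\bar{I}} && (49) \\
&< \sqrt{\frac{1}{\bar{I}} + \frac{\sum_{k=1}^{K} (2 e_k + \theta)}{\bar{I}^2} + \frac{\theta}{2^{r_U}\, \bar{I}}\, \frac{\sum_{k=1}^{K} \mathsf{E}[M^k]}{\bar{I}}}, && (50)
\end{aligned}
$$

where we used (47) to write (49), and $\tilde{U}_{\mathcal{T}} < \bar{I} + \sum_{k=1}^{K} (e_k + \theta)$ (cf. Appendix E) to write (50). Recall
from Appendix F that $\mathsf{E}[M^k]/\bar{I} = \Theta(1/e_k)$, hence (50) tends to $0$ if $e_k = o(\bar{I}^2)$, $\forall k$.

For U-sDMLE and U-dsDMLE, using (49), from (44) and (48) we can write

$$ \frac{\mathsf{E}[|V_{\mathcal{T}} - U_{\mathcal{T}}\, x|]}{\bar{I}} < \sqrt{\frac{1}{\bar{I}} + \frac{K T_U \theta}{\bar{I}^2} + \frac{K \theta}{2^{r_U}\, \bar{I}}\, \frac{\mathsf{E}[M_{\mathcal{T}}]}{\bar{I}/T_U}}, \tag{52} $$

which tends to $0$ if $T_U = o(\bar{I}^2)$, concluding the proof.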

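Appendix H: Proof of Theorem 4

As in Theorem 1 and Theorem 2, we will show convergence in mean, i.e., $\mathsf{E}[|\tilde{x}_{\mathcal{T}} - x|] \to 0$, to prove
the theorem. Note that we can write $\tilde{x}_{\mathcal{T}} - x$ as

$$ \tilde{x}_{\mathcal{T}} - x = \frac{\tilde{V}_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}} - x = \frac{U_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}} \left( \frac{\tilde{V}_{\mathcal{T}}}{U_{\mathcal{T}}} - \frac{\tilde{U}_{\mathcal{T}}}{U_{\mathcal{T}}}\, x \right). \tag{53} $$

Now, as we did before in Theorem 1 and Theorem 2, we add and subtract $\hat{x}_{\mathcal{T}}$ inside the parentheses, i.e.,
$\tilde{x}_{\mathcal{T}} - x = \frac{U_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}} \big( \frac{\tilde{V}_{\mathcal{T}}}{U_{\mathcal{T}}} - \hat{x}_{\mathcal{T}} + \hat{x}_{\mathcal{T}} - \frac{\tilde{U}_{\mathcal{T}}}{U_{\mathcal{T}}}\, x \big)$. Replace the first $\hat{x}_{\mathcal{T}}$ with $\frac{V_{\mathcal{T}}}{U_{\mathcal{T}}}$, and the second one with $x + \frac{V_{\mathcal{T}} - U_{\mathcal{T}} x}{U_{\mathcal{T}}}$.
After distributing $\frac{U_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}}$ through the parentheses, and taking the absolute value and the expectation of both
sides, we get

$$ \mathsf{E}[|\tilde{x}_{\mathcal{T}} - x|] \le \frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}} + |x|\, \frac{\mathsf{E}[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|]}{\bar{I}} + \frac{\mathsf{E}[|V_{\mathcal{T}} - U_{\mathcal{T}}\, x|]}{\bar{I}}, \tag{54} $$

since $\tilde{U}_{\mathcal{T}} \ge \bar{I}$. If $e_k \to \infty$ so that $e_k = o(\bar{I})$, $\forall k$, assuming $|x| < \infty$, the second term on the right-hand
side of (54) tends to zero, following from Lemma 4, and similarly, the last term tends to zero, following
from Lemma 5. For the first term, from the proof of Theorem 1 we write $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}} \le \sum_{k=1}^{K} \frac{\mathsf{E}[|V^k_{\mathcal{T}}|]}{2^{R_k}\, \bar{I}}$
and $\mathsf{E}[|V^k_{\mathcal{T}}|] = O(\mathsf{E}[\mathcal{T}])$, and thus from Lemma 3 $\mathsf{E}[|V^k_{\mathcal{T}}|] = O(\bar{I})$. Hence, the first term on the right-hand
side of (54) tends to zero if $R_k \to \infty$ at any rate, $\forall k$.

Appendix I: Proof of Theorem 5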



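Since $I_{\mathcal{T}} = U_{\mathcal{T}}$, from the proof of Theorem 4 we can write

$$ \sqrt{U_{\mathcal{T}}}\,(\tilde{x}_{\mathcal{T}} - x) = \frac{U_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}} \left( \frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{U_{\mathcal{T}}}} + \frac{U_{\mathcal{T}} - \tilde{U}_{\mathcal{T}}}{\sqrt{U_{\mathcal{T}}}}\, x + \frac{V_{\mathcal{T}} - U_{\mathcal{T}}\, x}{\sqrt{U_{\mathcal{T}}}} \right). \tag{55} $$

Note from Section III that $\frac{V_{\mathcal{T}} - U_{\mathcal{T}}\, x}{\sqrt{U_{\mathcal{T}}}} \sim \mathcal{N}(0, 1)$. Hence, it is sufficient to show that $\frac{U_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}} \to 1$, $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{U_{\mathcal{T}}}} \to 0$,
and $\frac{\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}}{\sqrt{U_{\mathcal{T}}}} \to 0$ as $\bar{I} \to \infty$. It is shown in (37) that $\bar{I} \le \tilde{U}_{\mathcal{T}} < \bar{I} + \sum_{k=1}^{K} (e_k + \theta)$, hence $\frac{\tilde{U}_{\mathcal{T}}}{\bar{I}} \to 1$ if
$e_k = o(\bar{I})$. From (47) we can write

$$ \tilde{U}_{\mathcal{T}} - \sum_{k=1}^{K} \frac{\theta\, M^k}{2^{r_U}} \le U_{\mathcal{T}} \le \tilde{U}_{\mathcal{T}} + \sum_{k=1}^{K} \frac{\theta\, M^k}{2^{r_U}} + \sum_{k=1}^{K} e_k. \tag{56} $$

Using $\bar{I}$ and $\bar{I} + \sum_{k=1}^{K} (e_k + \theta)$ as the lower and upper bounds for $\tilde{U}_{\mathcal{T}}$, respectively, and noting, from
Appendix F, that $M^k = \Theta(\bar{I}/e_k)$, we see from (56) that $\frac{U_{\mathcal{T}}}{\bar{I}} \to 1$ if $e_k = o(\bar{I})$, and either $e_k$ or $r_U$ tends
to infinity, $\forall k$. Therefore, $\frac{U_{\mathcal{T}}}{\tilde{U}_{\mathcal{T}}} \to 1$, and it is sufficient to show that $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{\bar{I}}} \to 0$ and $\frac{\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}}{\sqrt{\bar{I}}} \to 0$. From
the proof of Theorem 4 we have $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$ if $2^{R_k} \to \infty$ faster than $\sqrt{\bar{I}}$, i.e., $R_k = \omega(\log \sqrt{\bar{I}})$,
implying, by Markov's inequality, $\frac{|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|}{\sqrt{\bar{I}}} \xrightarrow{p} 0$, which in turn implies $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{\bar{I}}} \xrightarrow{p} 0$. Similarly, from (47)
we see that $\frac{\mathsf{E}[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$ if $e_k = o(\sqrt{\bar{I}})$ and $r_U = \omega(\log(\sqrt{\bar{I}}/e_k))$, $\forall k$, concluding the proof.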



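Appendix J: Proof of Theorem 6

LT-dsDMLE differs from LT-sDMLE only in the transmission of $V_{\mathcal{T}}$, hence the proof of Theorem 4
up to and including (54) applies here. Moreover, as in Theorem 4, if $e_k \to \infty$ so that $e_k = o(\bar{I})$, $\forall k$,
we have $|x|\,\frac{\mathsf{E}[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|]}{\bar{I}} \to 0$, and $\frac{\mathsf{E}[|V_{\mathcal{T}} - U_{\mathcal{T}}\, x|]}{\bar{I}} \to 0$ from Lemma 4 and Lemma 5, respectively. For $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}}$,
different from Theorem 4, similar to (32), we write

$$
\begin{aligned}
\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}}
&\le \sum_{k=1}^{K} \frac{\mathsf{E}\big[\sum_{n=1}^{N^k} |\tilde{v}^k_n - v^k_n|\big]}{\bar{I}} + \sum_{k=1}^{K} \frac{\mathsf{E}\Big[\Big|\sum_{\tau = t^k_{N^k,V}+1}^{\mathcal{T}} \frac{2\Re((y^k_\tau)^* h^k_\tau)}{\sigma_k^2}\Big|\Big]}{\bar{I}} \\
&< \frac{\phi}{2^{r_V - 1}} \sum_{k=1}^{K} \frac{\mathsf{E}[N^k]}{\bar{I}} + \sum_{k=1}^{K} \frac{d_k}{\bar{I}},
\end{aligned}
\tag{57}
$$

where we used $|\tilde{v}^k_n - v^k_n| < \phi/2^{r_V - 1}$ (cf. proof of Theorem 3), and $\Big|\sum_{\tau = t^k_{N^k,V}+1}^{\mathcal{T}} \frac{2\Re((y^k_\tau)^* h^k_\tau)}{\sigma_k^2}\Big| < d_k$ (cf.
proof of Theorem 2). The second term on the right-hand side of (57) tends to zero if $d_k = o(\bar{I})$, $\forall k$.

Note that the shortest intersampling interval, $\tau^k_{n,V} = t^k_{n,V} - t^k_{n-1,V}$, happens when $\big\{\frac{2\Re((y^k_\tau)^* h^k_\tau)}{\sigma_k^2}\big\}$,
$\tau \in (t^k_{n-1,V}, t^k_{n,V}]$, have the same sign and the largest possible magnitude, which is less than $\phi$, i.e., $\tau^k_{n,V} > d_k/\phi$,
$\forall n, k$. Assume that $\tau^k_{n,V} < \infty$, $\forall n, k$, and there exists a constant $C$ such that $\tau^k_{n,V} < C\, d_k/\phi$, $\forall n, k$,
where $1 < C < \infty$. Using these bounds on $\tau^k_{n,V}$ we can write

$$ \sum_{n=1}^{N^k} \tau^k_{n,V} \le \mathcal{T} < \sum_{n=1}^{N^k + 1} \tau^k_{n,V}, \tag{58} $$

$$ \frac{\mathcal{T}\,\phi}{C\, d_k} - 1 < N^k < \frac{\mathcal{T}\,\phi}{d_k}, $$

$$ \frac{\mathsf{E}[\mathcal{T}]\,\phi}{C\, d_k\, \bar{I}} - \frac{1}{\bar{I}} < \frac{\mathsf{E}[N^k]}{\bar{I}} < \frac{\mathsf{E}[\mathcal{T}]\,\phi}{d_k\, \bar{I}}. \tag{59} $$

Since $\mathsf{E}[\mathcal{T}] = \Theta(\bar{I})$ from Lemma 3 and $\frac{1}{\bar{I}} = o\big(\frac{1}{d_k}\big)$ for $d_k = o(\bar{I})$, we have $\frac{\mathsf{E}[N^k]}{\bar{I}} = \Theta\big(\frac{1}{d_k}\big)$,
substituting which in (57) we conclude the proof.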

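Appendix K: Proof of Theorem 7

The proof follows that of Theorem 5 except for the part showing $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{\bar{I}}} \to 0$, since LT-dsDMLE
differs from LT-sDMLE only in the transmission of $V_{\mathcal{T}}$. From (57) and the discussion following it,
we have $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$ if $d_k = o(\sqrt{\bar{I}})$ and $r_V = \omega(\log(\sqrt{\bar{I}}/d_k))$, implying, by Markov's inequality,
$\frac{|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|}{\sqrt{\bar{I}}} \xrightarrow{p} 0$, and thus $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{\bar{I}}} \xrightarrow{p} 0$, concluding the proof.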

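Appendix L: Proof of Theorem 8

The first part of the proof is identical to the proof of Theorem 4. To show asymptotic optimality, we
start with (55). Similar to Theorem 5, we will first show that $\frac{\tilde{U}_{\mathcal{T}}}{\bar{I}} \to 1$ and $\frac{U_{\mathcal{T}}}{\bar{I}} \to 1$. From (44), we see
that $\frac{\tilde{U}_{\mathcal{T}}}{\bar{I}} \to 1$ since $T_U = o(\bar{I})$. From (48), we can write $\tilde{U}_{\mathcal{T}} - \frac{K \theta\, M_{\mathcal{T}} T_U}{2^{r_U}} \le U_{\mathcal{T}} \le \tilde{U}_{\mathcal{T}} + \frac{K \theta\, M_{\mathcal{T}} T_U}{2^{r_U}}$, where
$M_{\mathcal{T}} T_U = \mathcal{T} = \Theta(\bar{I})$ from Lemma 3. Hence, $\frac{U_{\mathcal{T}}}{\bar{I}} \to 1$ if $r_U \to \infty$ at any rate. Now, we need to show
that $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{\bar{I}}} \to 0$, and $\frac{\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}}{\sqrt{\bar{I}}} \to 0$. From the proof of Theorem 4 we can show that $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$ if
$R_k = \omega(\log \sqrt{\bar{I}})$, which implies $\frac{\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}}{\sqrt{\bar{I}}} \xrightarrow{p} 0$ by Markov's inequality. Similarly, from (48) we see that
$\frac{\mathsf{E}[|\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$, and thus $\frac{\tilde{U}_{\mathcal{T}} - U_{\mathcal{T}}}{\sqrt{\bar{I}}} \xrightarrow{p} 0$, if $r_U = \omega(\log \sqrt{\bar{I}})$. Note that we also need to have $T_U = o(\bar{I})$
for Lemma 3 to be valid, which concludes the proof.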

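Appendix M: Proof of Theorem 9
Since U-dsDMLE differs from U-sDMLE only in the transmission of $V_{\mathcal{T}}$, the proof follows that of
Theorem 8 except for the part showing $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}} \to 0$ and $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$. Hence, similar to (57), we
write

$$
\begin{aligned}
\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}}
&\le \sum_{k=1}^{K} \frac{\mathsf{E}\big[\sum_{m=1}^{N_{\mathcal{T}}} |\tilde{v}^k_{m T_V} - v^k_{m T_V}|\big]}{\bar{I}} + \sum_{k=1}^{K} \frac{\mathsf{E}\Big[\Big|\sum_{\tau = N_{\mathcal{T}} T_V + 1}^{\mathcal{T}} \frac{2\Re((y^k_\tau)^* h^k_\tau)}{\sigma_k^2}\Big|\Big]}{\bar{I}} \\
&< \frac{K \phi\, T_V}{\bar{I}} \left( \frac{\mathsf{E}[N_{\mathcal{T}}]}{2^{r_V - 1}} + 1 \right),
\end{aligned}
\tag{60}
$$

where we used the facts that $|\tilde{v}^k_{m T_V} - v^k_{m T_V}| < 2 T_V \phi / 2^{r_V}$ with a reasonable quantizer, and $\mathcal{T} - N_{\mathcal{T}} T_V <
T_V$. Note that $\mathcal{T}/T_V - 1 < N_{\mathcal{T}} \le \mathcal{T}/T_V$, hence we have $\mathsf{E}[N_{\mathcal{T}}] = \Theta(\bar{I}/T_V)$ from Lemma 3. Thus,
from (60) we have $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\bar{I}} \to 0$ if $T_V = o(\bar{I})$, and $r_V \to \infty$ at any rate; and $\frac{\mathsf{E}[|\tilde{V}_{\mathcal{T}} - V_{\mathcal{T}}|]}{\sqrt{\bar{I}}} \to 0$ if
$T_V = o(\sqrt{\bar{I}})$, and $r_V = \omega(\log \sqrt{\bar{I}})$, concluding the proof.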